NETWORKING SYSTEMS



The present invention relates to networking systems and the forwarding and routing of information therein, 
being more particularly directed to the problems of a common method for managing both cell and packet or frame 
switching in the same device, having common hardware, common QoS (Quality of Service) algorithms, common 
forwarding algorithms; building a switch that handles frame switching without interfering with cell switching. 

Background of Invention 

Two architectures driving networking solutions are cell switching and frame forwarding. Cell switching 
involves the transmission of data in fixed size units called cells. This is based on technology referred to as 
Asynchronous Transfer Mode (ATM). Frame forwarding transmits data in arbitrary size units referred to either as 
frames or packets. The basis of frame forwarding is used by a variety of protocols, the most noteworthy being the 
Internet Protocol (IP) suite. 

The present invention is concerned with forwarding cells and frames in a common system utilizing common 
forwarding algorithms. In co-pending U.S. patent application Serial No. 581,467, filed December 29, 1995, for High 
Performance Universal Multi-Ported Internally Cached Dynamic Random Access Memory System, Architecture and 
Method, and co-pending U.S. patent application Serial No. 900,757, filed July 25, 1997, for System Architecture for 
and Method of Dual Path Data Processing and Management of Packets and/or Cells and the Like, both of common 
assignee herewith, a promising solution for common cell/frame forwarding is provided. 

Most traditional Internet-style host-to-host data communication is carried out in variable size packet format, 
interconnected by networks (defined as a collection of switches) using packet switches called routers. Recently, 
ATM has become widely available as a technology to move data between hosts, having been developed to provide a 
common method for sending traditional telephony data as well as data for computer-to-computer communication. 
The previous method employed was to apply Time Division Multiplexing (TDM) to telephony data, with each circuit 
allocated a fixed amount of time on a channel. For example, circuit A may be allocated x amount of time (and thus 
data), followed by y and z and then x again, as later described in connection with hereinafter discussed Fig. 3. Thus 
each circuit is completely synchronous. This method, however, has intrinsic limitations with bandwidth utilization, 
since if a circuit has nothing to send its allocated bandwidth is not used on the line. ATM addresses this bandwidth 
issue by allowing the circuits to be asynchronous. Though bandwidth is still divided among fixed length data items, 
any circuit can transmit at any point in time. 

The ITU-T (International Telecommunications Union - Telecommunications, formerly the CCITT), an 
organization chartered by the United Nations to provide telecommunications standards, defined four classes of 
service: 1) Constant Bit Rate for Circuit Emulation, i.e. constant-rate voice and video; 2) Variable Bit Rate for 
certain voice and video applications; 3) Data for Connection-Oriented Traffic; and 4) Data for Connectionless-
Oriented Traffic. These services, in turn, are supported by certain "classes" of ATM traffic. ATM moves data in 
fixed size units called cells. There are several ATM "types"; these are referred to as ATM Adaption 
Layers (AAL). These ATM adaption layers are defined in ITU-T Recommendation I.363. There are 3 defined 
types: AAL1, AAL3/4 and AAL5. AAL2 has never been defined in the ITU-T recommendations and AAL3 and 
AAL4 were combined into one type. With respect to the ATM cell make-up, there is no way to distinguish cells that 
belong to one layer as opposed to cells that belong to another layer. 

The adaption layer is determined during circuit setup; i.e. when a host computer communicates to the 
network. At this time, the host computer informs the network of the layer it will use for a specific virtual circuit. 
AAL1 has been defined to be used for real-time applications such as voice or video; while AAL5 has been defined 
for use by traditional datagram oriented services such as forwarding IP datagrams. A series of AAL5 cells is 
defined to make up a packet. The definition of an AAL5 packet consists of a stream of cells with the PTI bit set to 
0, except for the last one (as later illustrated in Fig. 1). This is referred to as a segmented packet. 

Thus, in current networking technology data is transported in either variable size packets or fixed size cells, 
depending on the type of switching devices installed in the network. Routers can be connected directly to each other 
or through ATM networks. If connected directly, then packets are of arbitrary size; but if connected by ATM switches, 
then all packets exiting the router are chopped into fixed size cells of 53 bytes. 
Network architectures based on the Internet Protocol (IP) technology are designed as a "best effort" 
service. This means that if bandwidth is available, the data gets through. If, on the other hand, bandwidth is not 
available then the data is dropped. This works well with most computer data applications such as file transfer or 
remote terminal access. This does not work well with applications that cannot retransmit, or where retransmission is 
of no value, such as with video and voice. Getting a video frame out of order makes no sense, whereas file transfer 
applications can tolerate such anomalies. Since the packet size is arbitrary at any point in time, making specific delay 
variation commitments between any two frames is almost impossible, as there is no way of predicting what type and 
size of traffic is ahead of any other type of traffic. The buffers that must handle the data, moreover, must be able to 
receive the maximum data size, meaning that the buffering scheme must be optimized to handle larger data packets 
while at the same time not wasting too much memory on smaller packets. 

ATM is designed to provide several service categories for different applications. These include Constant Bit 
Rate (CBR), Available Bit Rate (ABR), Unspecified Bit Rate (UBR) and two versions of Variable Bit Rate (VBR), 
real-time and non-real-time. These service categories are defined in terms of Traffic Parameters and QoS Parameters. 
Traffic Parameters include Peak Cell Rate (constant bandwidth), Sustainable Cell Rate (SCR), Maximum Burst Size 
(MBS), Minimum Cell Rate (MCR) and Cell Delay Variation Tolerance. QoS parameters include Cell Delay 
Variation (CDV), Cell Loss Ratio (CLR) and maximum Cell Transfer Delay (maxCTD). As an example, Constant 
Bit Rate CBR (e.g. the service used for voice and video applications) is defined as a service category that allows the 
user at call setup time to specify the PCR (peak cell rate, essentially the bandwidth), the CDV, maxCTD and CLR. 
The network must then ensure that the values requested by the user and accepted by the network are met; if they are 
met, the network is said to be supporting CBR. 

The various classes of service direct the network to provide better service for some traffic as opposed to 
other types of traffic. In ATM, with fixed length cells, switches manage bandwidth utilization on a line effectively by 
controlling the amount of data each traffic flow is allowed to put on a line at any moment in time. They generally 
have simpler buffer techniques arising from the fact that there is but one size of data unit. Another advantage is 
predictable network delays, especially queuing latencies at each switch. Since all data units are the same size, this 
helps to ensure that such traffic QoS parameters as CDV are easily measurable in the network. In non-ATM 
networks (i.e. frame-based networks), frames can range anywhere from, say, 40 bytes to thousands of bytes, 
rendering it difficult to ensure a consistent CDV (or PDV, Packet Delay Variation) since it is impossible to predict 
the delays in the network, lacking consistent transfer times of individual packets. 

By carving data into smaller units, ATM can increase the ability of the network to decrease the latency of 
transmitting data from one host to another. Such also allows for easier queue and buffer management at each hop 
through the network. A disadvantage, however, is that a header is added to each cell, making the effective bandwidth 
of the network less than if the network had a larger transmission unit. For example, if 1,000 bytes are to be 
transferred from one host to another, then a frame-based solution would append a header (approximately 4 bytes) 
and transmit the entire frame in less than a second. In ATM, the 1,000 bytes is chopped into 48 byte units, each with a 
5 byte header: i.e. 1,000/48 = 20.833 (or 21 cells). Each cell is then given a 5 byte header, increasing the bytes to be 
transmitted by 5 * 21 = 105 extra bytes. Thus ATM effectively decreases the available bandwidth to the actual data 
by approximately 100 bytes (or about 10%); the decreasing of end-to-end latency also decreases the available 
bandwidth for data transmission. 
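
The arithmetic in the preceding example can be checked with a short sketch. The function name `atm_overhead` and the constant names are illustrative assumptions, not part of the specification; the sketch also ignores any adaption-layer trailer and counts only the 5 byte cell headers.

```python
import math

ATM_PAYLOAD_BYTES = 48  # payload carried by each cell (from the text)
ATM_HEADER_BYTES = 5    # header added to each cell (from the text)

def atm_overhead(data_bytes):
    """Return (number of cells, extra header bytes) for a given payload.

    Hypothetical helper for illustration: the last cell is assumed to be
    padded, so the cell count is rounded up.
    """
    cells = math.ceil(data_bytes / ATM_PAYLOAD_BYTES)
    return cells, cells * ATM_HEADER_BYTES

cells, extra = atm_overhead(1000)
print(cells, extra)  # 21 105
```

For 1,000 bytes this gives 21 cells and 105 header bytes, matching the figures above.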

For some applications, such as video and voice, latency is more important than bandwidth, while for other 
applications, such as file transfers, better bandwidth utilization increases performance rather than decreased 
hop-by-hop latency. 

Recently, the demands on more bandwidth and QoS have grown many fold due to new applications for 
multimedia services, including the before described video and voice. This is forcing the growth of ATM networks in 
the core of traditional packet-based networks. ATM, because of its fixed packet size, brings reduced processing time 
in networks and hence faster forwarding (i.e. lower latency). It also brings with it the ability to take advantage of 
traffic classification. Since the cells, as earlier pointed out, are of fixed size, traffic patterns can be controlled through 
QoS assignments: i.e. networks can carry traditional packets (in cell format) and constant bandwidth stream data 
(e.g. voice/video based data). 

As will subsequently be demonstrated, most conventional networking systems inherently are designed for 
either forwarding frames or cells but not both. In accordance with the present invention, on the other hand, through 
use of novel search algorithms, QoS management and management of packet/cell architecture, both cells and frames 
can be transmitted in the same device and with significant advantage over the prior techniques, as later more fully 
explained. 

Objects of Invention 

An object of the present invention, accordingly, is to provide a novel system architecture and method, 
useful with any technique for processing data packets and/or cells simultaneously with data packets, and without 
impacting the performance aspects of cell forwarding characteristics. 

A further object is to provide such a novel architecture in which the architected switch can serve as a packet 
switch in one application and as a cell switch in another application, using the same hardware and software. 

Still a further object is to provide such a system wherein improved results are achieved in managing QoS 
characteristics for both cells and data packets simultaneously based on a common cell/data packets algorithm. 

An additional object is to provide a common parsing algorithm for forwarding cells and data packets using 
common and similar techniques. 

Other and further objects will be explained hereinafter, and are more particularly delineated in the appended 
claims. 

Summary 

In summary, from one of its important viewpoints, the invention encompasses in a data networking system 
wherein data is received as either ATM cells or arbitrarily-sized multi-protocol frames from a plurality of I/O 
modules any of which can be cell or frame interfaces, a method of processing both ATM cells and such frames in a 
native mode, i.e. not transforming frames to cells, using common algorithms for forwarding based on control 
information contained in the cell or frame and in such a manner as to preserve QoS characteristics necessary for 
correct operation of cell forwarding; processing the packet/cell control information in a forwarding engine with 
common algorithms not dependent on context-sensitive information contained in the cell or packet, and passing 
results including QoS information to an egress queue manager; passing the cell/packet to the egress I/O transmit 
facility in such a manner as to provide a minimal cell delay variation (CDV) so as not to impact correct cell 
forwarding characteristics; and controlling the transmit facility so as to provide a common bandwidth management 
algorithm for both cells and packets and all without impacting the correct operation of either cells or packets. 

Drawings 

The invention will now be described in connection with the accompanying drawings in which the 
before-mentioned Fig. 1 is a diagram illustrating an ATM (Asynchronous Transfer Mode) cell format; 

Fig. 2 is a similar diagram of an Internet Protocol (IP) frame format for 32 bit words; 

Fig. 3 is a flowchart comparing Time-Division Multiplexing (TDM), ATM and Packet Data frame 
forwarding; 

Fig. 4 is a block diagram of the switch of the invention with the cell and packet interfaces; 

Fig. 5 is a block diagram of a traditional prior art bus-based switching architecture, and Fig. 6, its 
memory-based switch data flow diagram; 

Fig. 7 is a block diagram of a traditional prior art cross-bar type switching architecture, and Fig. 8, its 
cross-bar data flow diagram; 

Figs. 9-11 are interface diagrams illustrating, respectively, a cell switch with a native interface card, a packet 
interface on cell switch, and an AAL5 packet interface on cell switch, all with a cross-bar or memory switch; 

Figs. 12 and 13 are similar diagrams of a packet switch with native packet interface cards and with AAL5 
interface, respectively, for NxN memory connection buses; 

Fig. 14 is a block diagram of the switch architecture of the present invention, using the word "NeoN" in 
connection with the packet and cell data switch as a trade name of NeoNET LLC, the assignee of the present 
application; 

Figs. 15 and 16 are diagrams respectively of extended parsing function flows for forwarding decisions and 
an overview of such functions, and Fig. 17 is a diagram of the forwarding elements; 

Fig. 18 is a first stage parse graph tree lookup block diagram, and Fig. 19 is a second stage forwarding table 
lookup (FLT) diagram; 

Figs. 20 and 21 are respective diagrams of parse graph memory on power up and of a simple illustrative IP 
multicast packet; 

Fig. 22 presents an initialized lookup table, with all entries pointing to unknown route/cell forwarding 
information, and Fig. 23 illustrates the lookup table after adding an illustrative IP address (209.6.34.224/32); and 

Fig. 24 is a queuing diagram for scheduling system operation. 

Further Background To Preferred Embodiment(s) of Invention 

Before proceeding to illustrate the preferred architecture of the invention, it is believed necessary to review 
the limitations of prior and current network systems, which the present invention admirably overcomes. 

Current networking solutions are designed either for switching data packets or cells. As before stated, all 
types of data networking switches must receive data on an ingress port, make a forwarding decision, transfer data 
from the ingress port to the egress port and transmit that data on the appropriate egress port physical interface. 
Beyond the basic data forwarding aspects, there are different requirements for cell switching versus frame 
forwarding. As before stated, all current technology divides switching elements into three types: bridges, routers and 
switches, and in particular, ATM switches. The distinction between bridges and routers is blurred in that both 
forward datagrams and typically most routers also do bridging functions as well; thus the discussion focuses on 
datagram switches (i.e. routers) and ATM switches. 

It is in order first to investigate the basic architectural requirements for these two types of switching devices 
based on current solutions, and then to present the reasons why current solutions do not provide mechanisms to allow 
simultaneous transfer of cells and frames without severely impacting the correct operations of either ATM switching 
or frame forwarding. The novel solution based on the present invention will then be clear. 

Routers typically have a wide variety of physical interfaces: LAN interfaces, such as Ethernet, Token Ring 
and FDDI, and wide-area interfaces, such as Frame Relay, X.25, T1 and ATM. A router has methods for receiving 
frames from these various interfaces, and each interface has different frame characteristics. For example, an Ethernet 
frame may be anywhere from 64 bytes to 1500 bytes, and an FDDI frame can be anywhere from 64 bytes to 
[…] (including header and trailer) bytes. The router's I/O module strips the header that is associated with the 
physical interface and presents the resulting frame, such as an IP datagram, to the forwarding engine. The forwarding 
engine looks at the IP destination address, Fig. 2, and makes an appropriate forwarding decision. The result of a 
forwarding decision is to send the datagram to the egress port, as determined by the forwarding tables. The egress port 
then attaches the appropriate network-dependent header and transmits the frame out the physical interface. Since 
different interfaces may have different frame size requirements, a router may be required to "fragment" a frame, i.e. 
"chop" the datagram into useable size. For example, a 2000 byte FDDI frame must be fragmented into frames of 
1500 bytes or less before being sent out on an Ethernet interface. 
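
The fragmentation step just described can be sketched as follows. This is a deliberately simplified illustration: the helper name `fragment_sizes` is hypothetical, and the sketch only splits a byte count into pieces no larger than the egress limit. It does not model per-fragment headers or the offset alignment rules of real IP fragmentation.

```python
def fragment_sizes(datagram_bytes, max_frame_bytes):
    """Split a datagram length into fragment lengths <= max_frame_bytes.

    Hypothetical helper: headers and alignment constraints are ignored.
    """
    sizes = []
    remaining = datagram_bytes
    while remaining > 0:
        chunk = min(remaining, max_frame_bytes)  # largest piece that fits
        sizes.append(chunk)
        remaining -= chunk
    return sizes

# A 2000 byte frame leaving through a 1500 byte interface
print(fragment_sizes(2000, 1500))  # [1500, 500]
```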

Current router technology offers "best effort" service. This means that there are no guarantees that 
datagrams will not be dropped in a router-based network. Furthermore, because routers transfer datagrams of varying 
sizes, there are no per datagram delay variation or latency guarantees. Typically a router is characterized by its 
ability to transfer datagrams of a certain size. Thus, the capacity of a router may be characterized by its ability to 
transfer 64 byte frames in one second or the latency to transfer a 1500 byte frame from an ingress port to an egress 
port. This latency is characterized by last bit in, first bit out. 

An ATM switch, by comparison, has only one type of interface, i.e. ATM. An ATM switch makes a 
forwarding decision by looking at a forwarding table based on VPI/VCI numbers, Fig. 1. The forwarding table is 
typically indexed by physical port number, i.e. an incoming cell with a VPI/VCI on ingress port N gets mapped to an 
egress port M with a new VPI/VCI pair. The table is managed by software elsewhere in the system. All cells, no 
matter what the ATM Adaptation Layer (AALx), have the same structure, so that if ATM switches can forward one 
AAL type, they can forward any type. 
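
A minimal sketch of such a table lookup, assuming a simple in-memory dictionary, is given below. The table contents, the key layout and the function name `forward_cell` are illustrative assumptions; real switches typically implement this lookup in hardware.

```python
# Hypothetical forwarding table:
# (ingress port, VPI, VCI) -> (egress port, new VPI, new VCI)
forwarding_table = {
    (1, 0, 100): (2, 5, 42),
    (1, 0, 101): (3, 7, 33),
}

def forward_cell(ingress_port, vpi, vci):
    """Return (egress port, new VPI, new VCI), or None if no entry exists."""
    return forwarding_table.get((ingress_port, vpi, vci))

print(forward_cell(1, 0, 100))  # (2, 5, 42)
print(forward_cell(4, 0, 100))  # None
```

Because the lookup uses only the port number and the VPI/VCI header fields, it is independent of the adaption layer carried in the cell payload, as the text notes.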

In order to switch ATM cells, several fundamental criteria must be met. The switch must be able to make 
forwarding decisions based on control information provided in the ATM header, specifically VPI/VCI. The switch 
must provide appropriate QoS functions. The switch must provide for specific service types, in particular Constant 
Bit Rate (CBR) traffic and Variable Bit Rate (VBR). CBR (voice or video) traffic is characterized by low latency 
and more importantly low or guaranteed Cell Delay Variation (CDV) and guaranteed bandwidth. 

The three main requirements of implementing CBR type connections over a traditional packet switch are 
low CDV, small Delay and guaranteed bandwidth. Voice, for example, consumes a fixed amount of bandwidth, 
based on the fundamental Nyquist sampling theorem. CDV is also part of a CBR contract, and plays a role in the 
overall Delay. CDV is the total worst case variance between expected arrival time and actual arrival time of a 
packet/cell. Insofar as an application is concerned, it wants to see data arrive equidistant in time. If, however, the 
network cannot guarantee this equidistant requirement, some hardware has to buffer data equal to or more than the 
worst case CDV amount introduced by the network. The higher the CDV, the higher is the buffer requirement and 
hence the higher Delay; and, as illustrated earlier, Delay is not good for CBR type circuits. 

Packet-based networks traditionally queue data at the egress based on priority of traffic. Regardless of how 
data is queued, traffic with low delay variation requirements will get queued behind one or more packets. Each of 
them could be of maximum packet size, and this inherently contributes the most to delay variation on a packet-based 
network. 

There are many methodologies used to manage bandwidth and priorities. From a Network Management 
point of view, a network manager usually likes to carve out the total egress bandwidth into priorities. There are 
several reasons for carving this bandwidth: e.g. it ensures the manager that control traffic (Higher Priority and Low 
Bandwidth) always has room on the wire even during very high line bandwidth utilization, or perhaps that CBR 
(Constant Bit Rate) traffic will be guaranteed on the wire, etc. 

There are numerous methods to address bandwidth per traffic priority. Broad classes of these mechanisms 
are Round Robin Queuing, Weighted Fair Queuing and Priority Queuing. Each methodology will be explained for 
the sake of discussion and completeness of this document. In all cases of queuing, traffic is put into queues based on 
priorities, usually by a hardware engine that looks at a cell/packet header or control information associated with the 
cell/packet as the cell/packet arrives from the backplane. It is how data is extracted/de-queued from these queues that 
differentiates one queuing mechanism from another. 

Simple Round Robin Queuing 

This queuing mechanism empties all queues in a round robin fashion. This means that traffic is divided into 
queues and each queue gets the same fixed bandwidth. While a clear advantage is simplicity of implementation, a 
major disadvantage of this queuing technique is that this mechanism completely loses the concept of priority. Priority 
must then be managed by buffer allocation mechanisms. 

Weighted Round Robin 

This queuing mechanism is an enhancement of "Simple Round Robin Queuing" where a weight is placed 
on each queue by the network manager during initialization time. In this mechanism each priority queue is serviced 
based on the weight. If one queue is allocated 10% of the bandwidth, it will be serviced 10% of the time. Another 
queue may have 50% of the allocated bandwidth, and will be serviced 50% of the time. The major drawback is that 
there is unused bandwidth on the wire when there is no traffic in a queue of the allocated bandwidth. This results in 
wasted bandwidth. There is, moreover, no association of packet size in the dequeuing algorithm, which is crucial for 
packet-based switches. Giving equal weight to all packet sizes throws off the bandwidth allocation scheme. 
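
The wasted-bandwidth drawback can be illustrated with a small simulation. This is a sketch under stated assumptions rather than any particular switch's scheduler: the function name `weighted_round_robin` is hypothetical, weights are expressed as whole transmit slots per cycle, and a slot whose owning queue is empty is simply left unused.

```python
from collections import deque

def weighted_round_robin(queues, weights, slots):
    """Serve `slots` transmit opportunities.

    Queue i owns weights[i] slots per cycle. If the owning queue is empty
    the slot goes unused (recorded as None), which models wasted bandwidth.
    """
    schedule = [i for i, w in enumerate(weights) for _ in range(w)]
    sent = []
    for slot in range(slots):
        q = queues[schedule[slot % len(schedule)]]
        sent.append(q.popleft() if q else None)
    return sent

# Queue 1 holds half of the bandwidth but has nothing to send.
queues = [deque(["a1", "a2"]), deque(), deque(["c1"])]
sent = weighted_round_robin(queues, weights=[1, 2, 1], slots=4)
print(sent)  # ['a1', None, None, 'c1']
```

Two of the four slots go unused even though "a2" is still waiting in the first queue. The sketch also counts items rather than bytes, which mirrors the packet-size issue noted above.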

Priority Queuing 

In this queuing mechanism, output queues are serviced purely based on priority. The Highest Priority 
Queue gets serviced first, and the Lowest Priority Queue gets serviced last. In this mechanism, Higher Priority 
Traffic always preempts the Lower Priority Queue. The drawback of this type of mechanism is that the Lower 
Priority Queue may end up with zero bandwidth. The advantage of this mechanism, besides being simple, is that the 
bandwidth is not wasted: so long as there is data to send, it will be sent. There is, however, no association of packet 
size in the dequeuing algorithm, which is crucial for packet-based switches. Giving equal weight to all packet sizes 
throws off the bandwidth allocation scheme, as before noted. 
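
A minimal sketch of strict priority dequeuing is shown below, assuming the queues are held in a list ordered from highest to lowest priority. The function name `priority_dequeue` is an illustrative assumption.

```python
from collections import deque

def priority_dequeue(queues):
    """Return the next item from the highest-priority non-empty queue.

    queues[0] is assumed to be the highest priority. Returns None only
    when every queue is empty, so no transmit opportunity is wasted.
    """
    for q in queues:
        if q:
            return q.popleft()
    return None

queues = [deque(["hi1", "hi2", "hi3"]), deque(["lo1"])]
order = [priority_dequeue(queues) for _ in range(4)]
print(order)  # ['hi1', 'hi2', 'hi3', 'lo1']
```

The low priority item is sent only after the high priority queue drains; if high priority traffic kept arriving, it would never be sent, which is the zero-bandwidth drawback described above.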

From the above examples, there is a need to strike a balance between Priority Queuing and Weighted 
Round Robin Queuing, along with packet size. This calls for a solution provided by the present invention where 
[…], in addition, the output would be filled with data from a queue even when the bandwidth of that queue is 
exhausted, […] with other bandwidth-eligible queue data. This technique enforces the bandwidth per traffic queue 
requirement and also does not waste bandwidth on the wire, and is embodied in the invention. 

Architectural Issues in Switch Design 



Current switching solutions employ two distinct solutions: 1) memory and 2) cross-bar. These solutions are 
illustrated in Figs. 5 and 6, showing a traditional bus-based and memory-based architecture, and in Fig. 7, showing a 
traditional cross-bar switching architecture. 

In the traditional memory-based solutions represented by Fig. 5, data must first be placed inside of main 
memory. Since several different I/O modules must transfer data to common memory, contention for this resource 
occurs. Main memory provides both a buffering mechanism and a transfer mechanism for data from one physical 
port to another physical port. The rate of transfer is then highly dependent on the speed of the egress port and the 
ability of the system to move data in and out of main memory and the number of interfaces that must access main 
memory. 

As more fully shown in Fig. 6, the CPU interfaces through a common bus, with memory access, with a 
plurality of data-receiving and transmitting I/O ports #1, #2, etc., with the various dotted and dashed lines showing 
the interfacing paths and the shared memory, as is well known. As pointed out previously, the various accesses of 
the shared memory result in substantial contention, increasing the latency and unpredictability, which is already 
substantial in this kind of architecture because the processing of the control information cannot begin until the entire 
packet/cell is received. 

Furthermore, as the accesses to the shared memory are increased, so does the contention; and as the 
contention is increased, this results in increasing the latency of the system. In the traditional memory-based switch 
data flow diagram of Fig. 6, thus, where the access time per read or write to the memory is equal to M, and the 
number of bits for a memory access is W, the following functions occur: 

There is the write of data from the receive port #1 to shared memory. The time to transfer a packet or cell is 
equal to ((B*8)/W)*M, where B is equal to the number of bytes for the packet or cell, M is the access time per read 
or write to the memory and W is the number of bits for a memory access. As the packet gets larger so does the time 
to write it to memory. 

This means that if a packet is destined to an ATM interface as in Fig. 5, followed by a cell, the cell is 
delayed by the amount of transfer time from main memory, and in the worst case this could be N packets (where N is 
the number of packet, non-ATM interfaces) including the contention among other reads and writes on the bus. If, for 
example, B=4000 bytes and M is 80 nanoseconds (for a 64 bit-wide bus for DRAM access), then ((4000 * 8)/64) * 
80 = 40,000 nanoseconds for a packet transfer queued before a cell can be sent, and OC-48 is 170 nanoseconds per 
64 byte cell. This is only if there is no contention on the bus whatsoever. In the worst case, if a switch has 16 ports 
and all the ports are contending simultaneously, then to transfer the same packet would require 640,000 nanoseconds 
just to get into the memory, and the same amount to get out, a total time of about 1.3 milliseconds. This occurs if 
between each write into memory, another port has to write to memory as well. So for n=16 ports, n-1, or 15, ports 
have to gain access to memory. This means that 15 ports * 80 nanoseconds = 1200 nanoseconds are used by the 
system before the next transfer into memory for the original port can occur. Since there are (4000 bytes * 8 
bits/byte)/64 bits = 500 accesses, each access is separated by 1200 nanoseconds, and the full transfer takes 500 * 
1200 = 600,000 nanoseconds. So the total is system time plus actual transfer time, which is 600,000 nanoseconds + 
40,000 nanoseconds = 640,000 nanoseconds for the transfer into memory, and another 640,000 nanoseconds out of 
memory. This calculation, moreover, does not include any CPU contention issues or delay because of egress port 
busy, which would make this calculation even larger. 
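
The figures in this worked example can be reproduced with the sketch below. The function names are hypothetical; the model is exactly the one stated in the text, with B bytes, W bits per access, M nanoseconds per access and, in the contended case, every other port taking one access between each access of the original port.

```python
import math

def transfer_ns(packet_bytes, bus_bits, access_ns):
    """Uncontended transfer time: ((B * 8) / W) * M."""
    accesses = math.ceil(packet_bytes * 8 / bus_bits)
    return accesses * access_ns

def contended_transfer_ns(packet_bytes, bus_bits, access_ns, ports):
    """Worst case from the text: the other (ports - 1) ports each take one
    access between consecutive accesses of the original port."""
    accesses = math.ceil(packet_bytes * 8 / bus_bits)
    wait_ns = (ports - 1) * access_ns          # 15 * 80 = 1200 ns per access
    return accesses * wait_ns + accesses * access_ns

print(transfer_ns(4000, 64, 80))                    # 40000
print(contended_transfer_ns(4000, 64, 80, 16))      # 640000
print(2 * contended_transfer_ns(4000, 64, 80, 16))  # 1280000, about 1.3 ms in and out
```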

There are similar disadvantages in traditional cross-bar based solutions as shown in Fig. 7, before 
referenced, where there is no main memory, and buffering of data occurs both at the ingress port and egress port. In 
the memory-based design of Figs. 5 and 6, buffer memory is shared across all ports, making for very efficient 
utilization of memory on the switch. In the cross-bar approach of Fig. 7, each port must provide a large amount of 
memory, so that the overall memory of the system is large as there is no common sharing of buffers. The cross-bar 
switch is only a conduit for the transfer of data from one physical port on the system to another physical port on the 
system. If two ports are simultaneously to transfer data to one output port, one of the two input ports must buffer the 
data, thereby increasing the latency and unpredictability as the data from the first input port is transferred to the 
output port. The advantage of a cross-bar switch over a memory-based switch, however, is the high rate of data 
transfer from one port to another without the inherent limitation of main memory contention on the memory-based 
switch. 

In the traditional cross-bar switching architecture system of Fig. 7, the CPU interfaces through a common 
bus, with memory access, to an interface with the various dotted and dashed lines of Fig. 8 showing the interfacing 
paths and the shared memory, as is well known. The CPU makes a forwarding decision based on information in the 
data. The data must then be transmitted across the cross-bar switch fabric to the egress port. But if other traffic is 
being forwarded to that egress interface, then the data must be buffered in the ingress interface for so long as the 
amount of time it takes to transfer the entire cell/packet to the egress memory. There is: 

A. Write of data from the receive port #1 to local memory. The time to transfer a packet or cell is equal 
to ((B*8)/W)*M, where B is equal to the number of bytes for the packet or cell, M is the access time 
per read or write to the memory and W is the number of bits for a memory access. As the packet gets 
larger so does the time to write it to memory. 

B. Write of data from the receive port #1 to local memory of egress port #2. The time to transfer a 
packet or cell is equal to ((B*8)/W)*M + T, where B is equal to the number of bytes for the packet or 
cell, M is the access time per read or write to the memory, W is the number of bits for a memory 
access and T is the transfer time of the cross-bar switch. As the packet gets larger, so does the time to 
transfer it across the cross-bar switch and write it to local memory. 

For a packet transfer followed by a cell transfer to an egress port, the calculation is the same as for the 
memory-based solution of Figs. 5 and 6. The packet must be transferred to local memory at the same speeds as for 
the memory-based solution. The advantage that there is no contention for central memory does not alleviate the 
problem that a packet transfer in front of a cell transfer can cause delays that prevent the proper functioning of very 
fast interface speeds. 

The goal is to create a switching device running at high speeds (i.e. SONET defined rates) that provides the 
required QoS. The device should be scalable in terms of speed and ports, and the device should allow for equal-time 
transfer of cells and frames from an ingress port to an egress. 

While current designs have started to come up with very high speed routers, they have not, however, been 
able to provide all the ATM service requirements, thus still maintaining a polarized set of networking devices, i.e. 
routers and ATM switches. An optimal solution is one that achieves very high speeds and that provides the required 
QoS support and has interfaces that merge ATM and packet-based technologies on the same interface, Fig. 3. This 
will allow the current investment in either networking technology to be preserved, yet satisfy bandwidth and QoS 
demands. 

The issues in merging interfaces on a data switch port that accepts ATM cells and treats certain ATM cells 
as packets and others as ATM flows, accepts only packets on other interfaces and only cells on yet another set of 
interfaces, are shown in later-discussed Fig. 4. These issues are threefold: a) forwarding decision at the ingress 
interface for packets and cells, b) switching packets and cells through the switch fabric, and c) managing egress 
bandwidth on packets and cells. The present invention, based on the technique of the previously cited co-pending 
applications, explains how to create a general data switch that merges the two technologies (i.e. ATM switching and 
packet switching) and solves the three issues listed above. 

Interface Issues Switch Designs 

The purpose of this section is to compare and contrast ATM and packet-based switch designs and various 
interfaces on either type of switch design. Specifically it identifies problems with both devices as they pertain to forwarding 
packets or cells; i.e. issues with ATM switches forwarding packets, and issues with packet switches forwarding cells, Fig. 3. 

Typical Design of an ATM Switch 

As previously explained, defined within the ATM standard there are multiple ATM Adaptation Layers (AAL1-AAL5), 
each one specifying a different type of service from a wide spectrum of services; namely, Constant Bit Rate (CBR) 
to Unspecified Bit Rate (UBR). A Constant Bit Rate (AAL1) contract guarantees minimal cell loss with low CDV, while an 
Unspecified Bit Rate contract specifies no traffic parameters and no Quality of Service guarantees. For the purposes of this 
invention it is convenient to limit the discussion to AAL1 (CBR) and AAL5 (Fragmented Packets). 

Fig. 9 illustrates cell switching with native cell interface cards, showing different modules of a generic ATM 
Switch with native ATM interfaces. The cells arriving from the physical layer module (PHY) are processed by a module 
called Policing Function Module, which validates per VCI established contracts (services) for incoming cells; e.g. Peak 
Cell Rate, Sustained Cell Rate, Maximum Burst Rate. Other parameters such as Cell Delay Variation (CDV) and Cell Loss 
Rate (CLR) are guarantees provided by the box based on the actual design of the cards and the switch. The contracts are set 
by the network manager or via ATM signaling mechanisms. Cell data from the policing function then goes, in the 
example of Fig. 9, to a Cross Bar-type (Fig. 7) or Memory-based Switch (Fig. 5). Cells are then forwarded to the egress port 
which has some requirements of shaping traffic to avoid congestion on the remote connection. To provide egress shaping, 
the design will have to buffer data on the egress side. Since ATM connections are based on a point-to-point basis, the 
Egress shaper module also has to translate the ATM Header. This is because the next hop has no relationship to the ingress 
VCI/VPI. 

Native Packet Interface on ATM Switch 

As mentioned in the 'Background' section, if an ATM switch is to provide a method that facilitates the routing of 
packets, there have to be at least two points between two hosts where packet and cell networks meet. This means that 
current cell switching equipment has to carry interfaces that have native packet interfaces, unless the switch is sitting deep 
in the core of the ATM network. It is now in order, therefore, to examine the design of such a packet interface that connects 
to the ATM switch. 

A typical packet interface on an ATM Switch is shown in Fig. 10, elaborating on the packet interface on the cell 
switch. The physical interface would put incoming packets into a buffer and then they are fed to the "Header Lookup and 
Forwarding Engine". The packet-based forwarding engine decides the egress port and associates a VCI number for cells of 
that packet. The packet then gets segmented into cells by the Segmentation Unit. From there on, the packet is treated just as 
in the native Cell Switching case, which involves going through a policing function and to the Switch Buffer before entering 
the switch. On the egress side, if the cells enter a cell interface, then the processing is just as explained above (in the native 
cell interface on ATM switch). If the cells enter a packet interface, then the cells have to be reassembled into packets. These 
packets are then put into various priority queues and then emptied as in the packet switch. 

Two types of packet interfaces on the ATM Switch should be examined. 

AAL5 Interface on ATM Switch 

A Router connected to an ATM Switch could segment packets before sending the packet to the ATM Switch. In that 
case, packets would arrive at the ATM Switch in AAL5 format, before described. If the ATM Switch were to act as a 
Router and an ATM Switch, it would have to reassemble the AAL5 packet and perform a routing decision on it. Once the 
ATM Switch/Router makes the forwarding decision on the AAL5 packet, it would then push it through the ATM Switch 
after segmenting it again. 

An AAL5 packet interface on an ATM Switch is shown in Fig. 11. Incoming AAL5 cells are first policed on a per 
VCI basis to ensure that the sender is honoring the contract. Once the policing function is done, an Assembler will 
assemble the cells of a VCI into packets. These packets are then forwarded to the forwarding engine, which makes the 
forwarding decision on the assembled packet and some routing algorithm. The packet then travels the ATM Switch as 
mentioned in the Packet Interface on ATM Switch section, above. 

Difficulties in Processing Packets on Cell Switch 

Keeping the goal of the present invention in mind, i.e. to achieve strict QoS parameters such as CDV and latency 
and Packet Loss, this section lists the difficulties of attempting to design for packets through a traditional cell switch. 

According to Fig. 11, once the incoming AAL5 segmented packets are assembled and a forwarding decision is 
made, they are segmented in the "Segmentation Unit". Across the Switch, the AAL5 cells are then reassembled into 
packets before they are shipped on the egress wire. This segmentation and reassembly adds to the delay and to unpredictable 
and unmeasurable PDV (Packet Delay Variation) and cell loss. As earlier mentioned, for packets to be provided QoS, the 
switch would need to support a contract that includes providing measurable PDV and delay. Delay is caused by the fact that the cells 
have to be reassembled. Each reassembly would have to, in the best case, buffer an entire packet's worth of data before calling it 
complete and sending it to the QoS section. For an 8000 byte packet, for example, this could result in 64 µsec delay in 
buffering on a 1 Gigabit switch. 

The PDV for a packet through a cell switch is even more of a concern than the additional delay. The assembly 
process can be processing multiple packets at the same time from various ingress ports and packets, and this causes an 
unpredictable amount of PDV, essentially based on switch contention and the number of retries of sending cells from 
ingress to egress. 
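
The buffering delay quoted above for an 8000 byte packet follows from simple arithmetic, sketched below. The function name is hypothetical, and the sketch assumes that "1 Gigabit" means a rate of 10^9 bits per second and that the whole packet must be buffered before it is released.

```python
def reassembly_buffer_delay(packet_bytes, link_bits_per_second):
    """Best-case time to buffer one whole packet arriving at the given bit rate."""
    return packet_bytes * 8 / link_bits_per_second


# 8000 bytes at 1 Gbit/s: 64,000 bits / 1e9 bits per second = 64 microseconds.
delay_seconds = reassembly_buffer_delay(8000, 1_000_000_000)
```

This is the best case for a single reassembly; contention and cell retries add a variable amount on top of it.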

Cell loss through the switch causes packets to get reassembled incorrectly and therefore adversely affects 
applications that are real-time content specific. Most file transfer protocols do recover from a dropped packet (due to 
dropped cells), but it causes more traffic on the switch due to retransmissions. 

In summary, passing packets through an ATM switch does not provide packets with the same CDV and latency 
characteristics as cells. It simply provides a mechanism for passing a packet path through a cell switch. 

Design of Packet Switch 

A traditional Packet Switch is shown in Fig. 12 with native packet interface cards. Packets are forwarded to the 
Forwarding Engine via the physical interface. The Forwarding Engine makes a routing decision based on some algorithm 
and the header of the packet. Once the egress port is decided, the packet travels to the egress via the Packet Switch, which 
could be designed in one of many ways (e.g. N by N busses, large central memory pool, etc.). On egress, the packets end up 
on different traffic priority Queues. These Queues are responsible for prioritizing traffic and bandwidth management. 

Cell Interface on Packet Switch 

The traditional packet switch, shown in Fig. 13 with AAL5 interfaces, provides a mechanism to allow cells to pass 
through the box so long as the cells are of AAL5 type. There is no practical way of creating a virtual cell switch through a 
traditional packet switch, and part of the present invention deals with the requirements of such an architecture. 

After AAL5 cells are policed for contract agreements, they are assembled into packets by an Assembly module. 
The packets thus created are then processed exactly like native packet interfaces. On the egress side, if packets have to go 
out of the Switch as AAL5 cells, they are first segmented and then header translated. Finally they are shaped and sent out. 

Difficulties in Processing Cells on Packet Switch: 

There are problems that a cell flow faces as it traverses a traditional packet switch. It is extremely difficult for a 
traditional data switch, such as a router, to support the QoS guarantees required of ATM. To illustrate the point, reference 
is made to the diagram shown in before-described Fig. 13. One of the biggest challenges for a packet switch is to support 
AAL1 cells. The simple reason is that the traditional packet-based Header Lookup and Forwarding engines do not 
simultaneously recognize cells and packets; therefore, only AAL5 cells, which can be converted into packets, are supported. This is 
a severe restriction in the capability of the switch. 

Among the features of cells are the CDV and the delay characteristics. Pushing cells through a traditional packet 
switch adds more delay and an unpredictable CDV. The packet switch, as is inherent in its name, implies that packets of 
various sizes and numbers are queued up on the switch. Packetized cells would then have no chance of maintaining any type 
of reasonable QoS through the switch. 

Preferred Embodiments of the Invention 

The present invention, exemplarily illustrated in Figs. 4 and 14, and unlike all these prior systems, 
optimizes the networking system for transmitting both cells and frames without internally converting one into the 
other. Furthermore, it maintains the strict QoS parameters expected in ATM switches, such as strict CDV, latency 
and cell loss. This is achieved by having a common ingress forwarding engine that is context independent, a switch 
fabric that transfers cells and frames with similar latency, and a common egress QoS engine; packets flowing 
through the architecture of the invention acquire cell QoS characteristics while the cells still maintain their QoS 
characteristics. 

The main components of the novel switch architecture of the invention, sometimes referred to herein by the 
acronym for the assignee herein, "NeoN," as shown in Fig. 14, comprise the ingress part, the switch fabric and the 
egress part. The ingress part is comprised of differing physical interfaces that may be cell or frame. A cell interface 
furthermore may be either pure cell forwarding or a mixture of cell and frame forwarding where a frame is comprised 
of a collection of cells as defined in AAL5. Another part of the ingress component is the forwarding engine, which is 
common to both cells and frames. The switch fabric is common to both cells and frames. The egress QoS is also 
common to both cells and frames. The final part of egress processing is the physical layer processing, which is 
dependent on the type of interface. Thus, the NeoN switch architecture of the invention describes those parts that are 
common to both cell and frame processing. 

The key parameters required for ATM switching, as earlier explained, and that are provided even in the 
case of simultaneous packet switching are predictable CDV, low latency, low cell loss and bandwidth 
management; i.e. providing a guaranteed Peak Cell Rate (PCR). The architecture of the invention, Figs. 4 and 14, 
however, contains two physical interfaces, AAL5/1 and packet interface, at the ingress and egress. The difference 
between the two types of interface is the modules listed as "Per VC Policing Function" and "Per VC Shaping". For 
cell interfaces (AAL1-5), the system has to honor contracts set by the network manager as per any ATM switch and 
also provide some sort of shaping on a per VCI basis at the egress. Besides those physical interface modules, the 
system is identical for a packet or a cell interface. The system is designed with the concept that once the data 
traverses the physical interface module, there should be no distinction between a packet and a cell. Fig. 14 lists the 
core of the architecture, which has three major blocks, namely, "Header Lookup and Forwarding Engine", "QoS", 
and "Switch" Fabric, that handle cells and packets indiscriminately. The discussion, as it relates to this invention, lies 
in the design of these three modules, which will now be discussed in detail. 

Switch Fabric 

The inventions presented in before-referenced co-pending U.S. patent applications Serial No. 581,467 and 
Serial No. 900,757, both of common assignee herewith, optimize the networking system for minimal latency, and can 
indeed achieve zero latency even as data rates and port densities are increased. They achieve this equally well, 
moreover, for either 53 byte cells or 64 byte to 64K byte packets through extracting the control information from 
the packet/cell as it is being written into memory, and providing the control information to a forwarding engine 
which will make switching, routing and/or filtering decisions as the data is being written into memory. 

Native Cells through the Switch 

The switch cells (AAL1/5) of Fig. 14 are first policed at 2 as per the contract the network manager has 
installed on a per VCI basis. This module could also assemble AAL5 cells into packets on selected VCI. Coming out 
of the policing function 2 are either cells or assembled packets. Beyond this juncture of the data flow, there is no 
distinction between a packet or a cell until the data reaches the egress port where data has to comply with the 
interface requirements. The cells are queued up in the "NeoN Data Switch" 4 and the cell header is examined for 
destination interface and QoS requirements. This information is passed on to the egress interface QoS module 6 via a 
Control Data Switch, so-labeled at 8. The QoS for a cell-type interface will simply ensure that cell rates beyond the 
Peak Cell Rate are clipped. The cells are then forwarded to the "Per VCI Shaping" module 10, where the cells are 
forwarded to the physical interface after they are shaped as per the requirements of the next hop switch. Since the 
QoS module 6 does not know from the control data whether a packet or a cell is involved, it simply requests the data 
from the NeoN Switch into the "Buffer 12." The control data informs the "Per VCI shaping" block 10 to do either 
header translation if it were a cell going into another VCI tunnel, and/or segmentation if the data was a packet going 
out on a cell interface and/or perform shaping as per the remote end requirements. 

Native Packets through the NeoN Switch 

As packets enter the interface card, the packet header is examined by a Header Lookup and Forwarding 
Engine module 14, while the data is sent to the NeoN data switch 4. The Ingress Forwarding Engine makes a 
forwarding decision about the QoS and the destination interface card based on the incoming packet header. The 
Forwarding Engine 14 also gathers all information regarding the data packet, like NeoN Switch address, Packet QoS, 
Egress Header Translation information, and sends it across to the egress interface card. This information is carried as 
a control packet to the egress port through the small non-blocking control data switch 8 to the Egress QoS module 6, 
which will queue data as per the control packet and send it to the module listed PHY at the egress. If the packet were 
to egress to a cell interface, the packet will be segmented, then header translated and shaped before it leaves the 
interface. 



Advantages of the NeoN Switch Architecture of the Invention 



As seen above, cells and packets flow through the box without any distinction except at the physical 
interfaces, such that if cell characteristics are maintained, then packets have the same characteristics as the cells. The 
packets may thus have measurable and low PDV (Packet Delay Variation) and low latency, with the architecture 
supporting packet switching with cell characteristics and yet interfacing to existing cell interfaces. 

While the traditional packet switch is unable to send non-AAL5 cells as before explained, AAL5 cells also 
suffer an unpredictable amount of PDV and delay - this being obviated by the NeoN Switch of the invention. 
Packets through a traditional ATM Switch also suffer the same long delays and unpredictable CDV - again, not the 
case in the NeoN Switch of the invention. The modules that make this type of hybrid switching of the invention 
possible include the Ingress Forwarding Engine, the Egress QoS, and the Switch Fabric. 

Ingress Forwarding Engine Description 

The purpose of the Ingress Forwarding Engine 14, Fig. 14, is to parse the input frame/cell and, based on 
predefined criteria and contents of the frame/cell, make a forwarding decision. This means that the input cell/frame is 
compared against items stored in memory. If a match is determined, then the contents of the memory location 
provide commands for actions on the cell/frame in question. The termination of the search, which is an iterative 
process, results in a forwarding decision. A forwarding decision is a determination of how to process the 
aforementioned frame/cell. Such processing may include counting statistics, dropping the frame or cell, or sending 
the frame or cell to a set of specified egress ports. In Fig. 15, this process is shown at a gross level. An input stream 
of four characters is shown: b, c, d, e. The characters have appropriate matching entries in memory, with a character 
input producing a pointer to the next character. The final character produces a pointer to a forwarding entry. A 
different stream of characters than that illustrated would have a different collection of entries in memory producing 
different results. 

The proposed Ingress Forwarding Engine 14 is defined to be a Parsing Micro-Engine. The Parsing Micro-Engine 
is divided into two parts - an active part and a passive part. The active part is referred to as the parser, being 
logic that follows instructions written into the passive memory component, which is composed of two major storage 
sections: 1) Parse Graph Tree (PGT), Fig. 18, and 2) Forwarding Lookup Table (FLT), Fig. 19, and a minor storage 
section for statistics collection. The Parse Graph Tree is a storage area that contains all the packet header parsing 
information, the result of which is an offset in the Forwarding Lookup Table. The FLT contains information about the 
destination port, multicast information, and egress header manipulation. The design is very flexible, e.g. in a datagram, it 
can traverse beyond the DA and SA fields in the packet header and search into the Protocol field and TCP Port 
number, etc. The proposed PGT is memory that is divided into 2^n blocks with each block having 2^m elements 
(where m < n). Each element can be one of three types - branch element, leaf element, or skip element - and within 
each block, there can be any combination of element types. 

While particularly useful for the purpose of the present invention, the Parsing Micro-Engine is generic from 
the standpoint that it examines an arbitrary collection of bits and makes decisions based on that comparison. This 
can be applied, for example, to any text-searching functions, searching for certain arbitrary words. In such 
applications, as an illustration, words such as "bomb" or "detonate" in a letter or email may be searched and if a 
match is detected, the search engine may then execute predetermined functions such as signaling an alarm. In fact 
the same memory can even be used to search for words in different languages. 

In the context of the invention, Fig. 14 illustrates having two entry points. One entry point is used to search 
for text in one language, while the second entry point is used to search for text in another language. Thus the same 
mechanisms and the same hardware are used for two types of searches. 

There are two components to the datagram header search: the software component and the hardware 
component. The software component creates the elements in the Parse Graph for every new route it finds on an 
interface. The software has to create a unique graph starting from a Branch Element and ending on a Leaf Element, 
later defined, for each additional new route. The hardware walks the graph from Branch to Leaf Element, clueless 
about the IP header. 

In fact, there can be many entry points in the memory region as illustrated in Fig. 21. The initial memory 
can be divided into multiple regions, each region of memory being a separate series of instructions used for different 
applications. In the case of Fig. 22, one of the regions is used for IP forwarding while the other region is used for 
ATM forwarding. At system start, the memory is initialized to point to "unknown route", meaning that no forwarding 
information is available. When a new entry is inserted, the structure of the Lookup Table changes as illustrated in 
Fig. 23. The illustrative IP address 209.6.34.224 is shown inserted. Since this is a byte-oriented lookup engine, the 
first block has a pointer inserted in the 209 location. The pointer points to a block that has a new pointer value in the 
6 location, and so on until all of the 209.6.34.224 address is inserted. All other values still point to unknown route. 
Inserting the address in the IP portion of memory has no impact on the ATM portion of memory. 
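
The byte-oriented insertion described above can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the hardware of the invention: blocks are modeled as 256-entry Python lists, the names Region, insert, lookup and UNKNOWN_ROUTE are hypothetical, and the forwarding entry "egress port 3" is an invented placeholder.

```python
UNKNOWN_ROUTE = "unknown route"


def new_block():
    """A block with one entry per byte value, all initially 'unknown route'."""
    return [UNKNOWN_ROUTE] * 256


class Region:
    """One region of lookup memory (for example the IP region or the ATM region)."""

    def __init__(self):
        self.root = new_block()  # first block, reached from the region's entry point

    def insert(self, key_bytes, forwarding_entry):
        block = self.root
        for value in key_bytes[:-1]:
            if not isinstance(block[value], list):
                block[value] = new_block()  # pointer to a newly allocated block
            block = block[value]
        block[key_bytes[-1]] = forwarding_entry

    def lookup(self, key_bytes):
        entry = self.root
        for value in key_bytes:
            entry = entry[value]
            if not isinstance(entry, list):
                return entry  # forwarding entry or unknown route
        return UNKNOWN_ROUTE  # key ended on a pointer, no forwarding entry


ip_region = Region()
atm_region = Region()
ip_region.insert([209, 6, 34, 224], "egress port 3")
```

After the insertion, ip_region.lookup([209, 6, 34, 224]) returns the placeholder entry, any other address still resolves to the unknown route, and atm_region is unchanged because it starts from a separate first block.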

As mentioned earlier, there are 2^n blocks each with 2^m elements in the parse graph tree. The structure of each element 
is as shown in Fig. 17, with each element having the following fields. 

1. Instruction Field: In the current design there are three instructions, resulting in a two-bit instruction field. The 
instruction description is as follows. 

• Branch Element (00). In so far as the Micro Engine is concerned, the branch element essentially points 
the Forwarding Engine to the next block address. Also, within the branch element, the user may set 
various fields in the "Incremental Forwarding Info Field", Fig. 18, and update various 
mutually exclusive elements of the final Forwarding Information. For example, if the micro engine is parsing 
an IP header, and the branch element was placed at the end of the destination field, the user could 
update the egress port field of the forwarding info. For ATM switching, the user would update the 
egress port information at the end of parsing the VPI field. 

• Leaf Element (01). This element signals the end of parsing to the micro engine. The forwarding 
information accumulated during the search is then forwarded to the next logical block in the design. 

• Skip Element (10). This element is provided to speed up the parsing. The time it takes to parse a 
packet header depends on the number of block addresses the micro engine has to look up. Not every 
sequential field in the incoming header is used to make a decision. If the skip element were not there, 
then the micro engine would have to keep hopping on non-significant fields of the incoming stream, 
adding to parsing time. The skip element allows the micro engine to skip fields in the incoming 
datagram and continue the search. The skip size is described below. 

2. Skip Field: This field is especially used for the skip element. This allows the parser to skip incoming 
datagram header fields to allow for faster searching. In an IP header, for example, if the user wanted to forward 
packets based on DA but count statistics based on the ToS (Type of Service) field, it would parse the entire DA and 
then step to the ToS field. This makes for a faster Forwarding Engine. The size of the skip field should be chosen 
to allow for the largest skip that the user would ever need for its data switching box, which could be based on 
the protocol, etc. 

3. Incremental Forwarding Info Field: During header parsing, forwarding information is accumulated. The 
forwarding information may have many mutually exclusive fields. The Forwarding Engine should be flexible [...] 
field, filtering could be decided on the source address, with QoS decided based on the ToS [...] 
the control path. The width of the incremental forwarding information (hereafter referred to as IFI) should be 
equal to the number of mutually exclusive incremental pieces in the forwarding information. 

4. Next Block Address Field: This field is the next block address to look up after the current one. The leaf node 
instruction ignores this field. 

5. Statistics Offset Field: In data switches, keeping flow statistics is as crucial as the switching of data itself. 
Without keeping flow statistics it would be difficult, at best, to manage a switch. Having this statistics offset 
field allows one to update statistics at various points of the parse. On an IP Router, for example, one could 
collect packet counts on various groups of DA, various groups of SA, all ToS, various protocols, etc. In another 
example dealing with an ATM switch, this field could allow the user to count cells on individual VPI or VCI or 
combinations thereof. If the designer wants to maintain 2^s counters, then the size of this field should be s. 

6. FLT Offset Field: This is an offset into the Forwarding Lookup Table, Fig. 18, later discussed in more 
detail. The Forwarding Lookup Table has all the mutually exclusive pieces of information that are required to 
build the final forwarding information packet. 
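
The six element fields listed above, and the way a parser could act on the three instructions, can be modeled with the following minimal sketch. It is a software illustration under stated assumptions, not the hardware design: the instruction codes 00/01/10 follow the text, but the Element class, the dictionary-based blocks, the DEFAULT_LEAF entry, the reading of the skip field as "header units to jump over after the current one", and all sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

BRANCH, LEAF, SKIP = 0b00, 0b01, 0b10  # two-bit instruction codes


@dataclass
class Element:
    instruction: int
    skip: int = 0                             # Skip Field (used by SKIP only)
    ifi: dict = field(default_factory=dict)   # Incremental Forwarding Info
    next_block: int = 0                       # Next Block Address (ignored by LEAF)
    stats_offset: Optional[int] = None        # Statistics Offset
    flt_offset: int = 0                       # FLT Offset


# Assumed default: unmatched values end the parse at FLT offset 0.
DEFAULT_LEAF = Element(LEAF, flt_offset=0)


def parse(blocks, start_block, header, stats):
    """Walk the parse graph over header units; return (flt_offset, forwarding info)."""
    info, block, pos = {}, start_block, 0
    while pos < len(header):
        element = blocks[block].get(header[pos], DEFAULT_LEAF)
        if element.stats_offset is not None:
            stats[element.stats_offset] = stats.get(element.stats_offset, 0) + 1
        info.update(element.ifi)              # accumulate forwarding information
        if element.instruction == LEAF:
            return element.flt_offset, info   # end of parsing
        pos += 1
        if element.instruction == SKIP:
            pos += element.skip               # jump over non-significant fields
        block = element.next_block
    return None, info                         # header exhausted before a leaf


blocks = {
    0: {1: Element(BRANCH, next_block=1, ifi={"egress_port": 7})},
    1: {2: Element(SKIP, skip=2, next_block=2)},
    2: {9: Element(LEAF, flt_offset=42, stats_offset=0)},
}
stats = {}
result = parse(blocks, 0, [1, 2, 0xAA, 0xBB, 9], stats)
```

In this run the two units 0xAA and 0xBB are never examined: the skip element steps over them, and the leaf reached on the final unit supplies FLT offset 42 together with the egress port accumulated at the branch.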

Reference Hardware Design Example 

The following is an example of a hardware reference design for the parser useful with practice of the 
present invention. The reference design parser has storage that contains the packet/cell under scrutiny. This storage 
element for the cell/frame header information is to be two levels in depth. This creates a two-stage pipeline for 
header information into the destination lookup stage of the Ingress Forwarding Engine. This is necessary because the 
Ingress Forwarding Engine will not be able to perform a lookup until the entire header information has been stored, 
due to the flexible starting point capability. The two-stage pipeline allows the Ingress Forwarding Engine to perform 
a lookup on the present header information and also stores the next header information in parallel. When the present 
header lookup is completed, then the next header lookup can proceed immediately. 

The storage element stores a programmable amount of the incoming bit stream. As an example, the 
configuration may be 64 bytes for IP datagrams and 5 bytes for cells. For an interface that handles both cells and 
frames, the maximum of these two values may be used. 

A DMA Transfer Done signal from each DMA channel will indicate to a state machine that it can begin 
snooping and storing header information from the Ingress DMA bus. A packet/cell signal will indicate that the 
header to be stored is either a packet header or a cell header. When header information has been completely stored 
from a DMA channel, a request lookup will be asserted. 

For header lookups, there will be a register-based table which will indicate to the Ingress Forwarding 
Engine the lookup starting point in the IP Header Table. The Ingress Forwarding Engine uses the source interface 
number to index this table. This information allows the Ingress Forwarding Engine to start the search at any field in 
the IP Header or fields contained in the data portion of the packet. This capability, along with the skip functions 
described, will allow the Ingress Forwarding Engine to search any fields and string them together to form complex 
filtering cases per interface. 

A suitable hardware lookup is shown in Fig. 19 using a Parse Tree Graph lookup algorithm to determine a 
forwarding decision. This algorithm parses either a nibble or a byte at a time of either an IP destination address or 
VPI/VCI header. This capability is programmable by software. Each lookup can have a unique tree structure that 
is pointed to by one of sixteen originating nodes, one per interface. The originating nodes are stored in a 
programmable register-based table, allowing software to build these trees anywhere in the memory structure. 

A nibble or byte lookup results in either an end node result or a branch node result. The lookup control 
state machine controls the lookup process by examining the status flag bits [...] 
state machine to skip a number of nibbles, indicated by the skip size, during the lookup. The bank select flag [...] 
clock enables and mux controls to activate. 

The result of the Parser lookup is the Forwarding Table lookup, which is a bank of memory yielding the 
forwarding result, including the forwarding information called the Forwarding ID. In order to optimize lookup time 
performance, this lookup stage can be pipelined, allowing the first stage to start another lookup in [...] 

The Forwarding ID field will be used in several ways. First, the MSB (Most Significant Bit) of the field is used to 
indicate a unicast or multicast packet at the network interface level. For multicast packets, for example, the Egress 
Queue Manager will need to look at this bit for queuing of multicast packets to multiple interfaces. For unicast 
packets, for example, six bits of the Forwarding ID can indicate the destination interface number and the remaining 
16 bits will provide a Layer 2 ID. The Layer 2 ID will be used by the Egress Forwarding Engine to determine what type of 
Layer 2 header needs to be prepended to the packet data. For packets, these headers will be added to the packet as it 
is moved from the Egress DMA FIFO (first in, first out) to the Egress Buffer Memory. For cells, the Layer 2 ID will 
provide the transmit device with the appropriate Channel ID. 

For unicast traffic, the Destination I/F number indicates the network destination interface and the Layer 2
ID indicates what type of Layer 2 header needs to be added onto the packet data. For multicast, the multicast ID
indicates both the type of Layer 2 header addition and which network interfaces can transmit the multicast. The
Egress Queue Manager will perform a Multicast ID table lookup to determine which interfaces the packet will be
transmitted on and what kind of Layer 2 header is put back on the packet data.
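
The use of the Forwarding ID fields can be sketched in a few lines of Python. This is an illustration only: the
text gives a one-bit unicast/multicast flag, a six-bit interface number and a 16-bit Layer 2 ID, but not the total
field width or the bit positions, so the 23-bit layout, the treatment of the remaining bits as a Multicast ID, and
every name below are assumptions.

```python
from typing import NamedTuple

# Assumed layout (not specified in the text):
#   bit 22      multicast flag (MSB)
#   bits 21..16 destination interface number (unicast)
#   bits 15..0  Layer 2 ID (unicast)
MCAST_BIT = 22
IF_SHIFT, IF_MASK = 16, 0x3F
L2_MASK = 0xFFFF


class ForwardingId(NamedTuple):
    multicast: bool
    dest_interface: int  # meaningful for unicast only
    layer2_id: int       # meaningful for unicast only
    multicast_id: int    # meaningful for multicast only


def encode_unicast(dest_interface: int, layer2_id: int) -> int:
    """Pack a unicast Forwarding ID under the assumed layout."""
    return ((dest_interface & IF_MASK) << IF_SHIFT) | (layer2_id & L2_MASK)


def decode_forwarding_id(fid: int) -> ForwardingId:
    """Split a Forwarding ID into the fields the egress side acts on."""
    if (fid >> MCAST_BIT) & 1:
        # Multicast: the remaining bits index the Multicast ID table.
        return ForwardingId(True, 0, 0, fid & ((1 << MCAST_BIT) - 1))
    return ForwardingId(False, (fid >> IF_SHIFT) & IF_MASK, fid & L2_MASK, 0)


print(decode_forwarding_id(encode_unicast(5, 0x1234)))
```

For a multicast result, the Egress Queue Manager would use the Multicast ID as a table index to find the set of
transmitting interfaces and the Layer 2 header type, as described above.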



An Example of Life of a Packet Under the Forwarding Engine



It is now in order to explain examples of the life of a simple packet under the forwarding engine of
the invention. On power up, Fig. 19, all blocks of the parse graph are filled with leaf elements pointing to an FLT
offset that will eventually forward all packets to the control processor; this is the default route
of all unrecognized packets. Software is responsible for setting up the default route. The way in which the various
elements are updated into this parse graph memory will be explained for a simple multicast IP packet, a
packet with mask 255.255.255.0, a packet with mask 255.255.0.0 and a complex filter packet, as well as for aging the
simple IP packet.

Simple Multicast Packet 

On power up, all of the blocks in the Parse Graph Memory may be assumed to be filled with leaf elements
that point to the offset of the FLT which will route the packet to the control processor. Assume, for
example, that the ingress packet has a destination IP address of 224.5.6.7. In this case, the hardware will look up the
224th offset in the 1st block (the first lookup block is also called the originating node) and find a leaf. The hardware
will end the search, take the default offset found in the 224th location, look up the FLT and forward the packet to the
control processor.

When the control processor forwards subsequent packets of destination IP address 224.5.6.7, it will
generate the graph shown in Fig. 21.

The software first has to create the parse graph locally. The parse graph created is listed as 129-131.
The software always looks up the first block, a.k.a. the originating node. The offset in the first block is 224, which is
the first byte of the destination IP address. It finds a default route, which is an indication for software to allocate a
new block for all subsequent bytes of the destination IP address. Once the software hits a default route, it knows that
this is a link node. From the link node onwards, the software has to allocate new blocks for every byte it wants the
hardware to search for a matched destination IP address. Through an appropriate software algorithm, it finds that 129,
2 and 131 are the next three available blocks to use. The software will then install a continuation element with BA of 2
in the 5th offset of block 129, a continuation element with BA of 131 in the 6th offset of block 2, and a leaf element of
FLT offset 3 at the 7th offset of block 131. Once such a branch with a leaf is created, the link node is then installed.
The link node has to be installed last in the new leafed branch. The link node in this case is a continuation element
with BA of 129 at offset 224 of the 1st block.

The hardware is now ready for any subsequent packets with destination IP address 224.5.6.7, even though it
knows nothing about it. Now, when the hardware sees the 224 of the destination IP address, it goes to the 224th
offset of the 1st block of the parse graph and finds a continuation element with BA of 129. The hardware will then go to
the 5th offset (second byte of destination IP address) of the 129th block and find another continuation element with
BA of 2. The hardware will then go to the 6th offset (third byte of destination IP address) of the 2nd block and find
another continuation element with BA of 131. The hardware will then go to the 7th offset (fourth byte of destination IP
address) of the 131st block and find a leaf element with FLT of 3. The hardware now knows that it has completed the
IP match and will forward the forwarding ID in location 3 to the subsequent hardware block, calling the end of
packet parsing.

It should be noted that the hardware is simply a slave of the parse graph put in memory by software. The
length of the search purely depends on the software requirements of parsing length and memory size. The adverse
effects of such parsing are size of memory, and search time which is directly proportional to the length of the search.

In this case, the search will result in the hardware effecting 4 lookups in the Parse Graph and 1 lookup in the FLT.
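
The install-then-lookup sequence for 224.5.6.7 can be sketched as follows. This is a minimal software model under
stated assumptions, not the hardware design: blocks are taken to hold 256 elements (one per byte value), the default
route is assumed to sit at FLT offset 0, and the names ParseGraph, Element, install and lookup are hypothetical. The
one behavior taken directly from the text is that the link node is written last.

```python
from dataclasses import dataclass

BLOCK_SIZE = 256      # assumed: one element per possible byte value
DEFAULT_FLT = 0       # assumed FLT offset of the default route (to the control processor)


@dataclass
class Element:
    is_leaf: bool
    value: int        # FLT offset for a leaf, block address (BA) for a continuation


def new_block():
    """A fresh block: every offset is a leaf pointing at the default route."""
    return [Element(True, DEFAULT_FLT) for _ in range(BLOCK_SIZE)]


class ParseGraph:
    def __init__(self, originating_block=1):
        self.origin = originating_block
        self.blocks = {originating_block: new_block()}

    def install(self, key, flt_offset, free_blocks):
        """Make the byte sequence `key` resolve to `flt_offset`.

        New block addresses are taken from `free_blocks` in order.
        """
        # Follow existing continuation elements until a leaf (the link node) is reached.
        block, depth = self.origin, 0
        while depth < len(key) - 1:
            elem = self.blocks[block][key[depth]]
            if elem.is_leaf:
                break
            block, depth = elem.value, depth + 1

        if depth == len(key) - 1:
            # Last byte of the key: no new block is needed, just write the leaf.
            self.blocks[block][key[depth]] = Element(True, flt_offset)
            return

        # Build the new branch first, in blocks the hardware cannot reach yet.
        chain = [free_blocks.pop(0) for _ in range(len(key) - 1 - depth)]
        for ba in chain:
            self.blocks[ba] = new_block()
        for i, ba in enumerate(chain):
            offset = key[depth + 1 + i]
            if i + 1 < len(chain):
                self.blocks[ba][offset] = Element(False, chain[i + 1])
            else:
                self.blocks[ba][offset] = Element(True, flt_offset)

        # Install the link node last, so a lookup never sees a half-built branch.
        self.blocks[block][key[depth]] = Element(False, chain[0])

    def lookup(self, header):
        """Return (FLT offset, number of parse graph lookups) for a header."""
        block = self.origin
        for lookups, byte in enumerate(header, start=1):
            elem = self.blocks[block][byte]
            if elem.is_leaf:
                return elem.value, lookups
            block = elem.value
        raise ValueError("header ended before a leaf element was reached")


graph = ParseGraph()
graph.install((224, 5, 6, 7), flt_offset=3, free_blocks=[129, 2, 131])
print(graph.lookup((224, 5, 6, 7)))   # FLT offset and parse graph lookup count
```

An address that was never installed stops at the first leaf it meets and returns the default FLT offset after a
single lookup, which is the power-up behavior described earlier.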

Packet with Mask 255.255.255.0

Building upon the parse graph in Fig. 20, a packet with an illustrative mask 255.255.255.0 and address of
4.6.7.x is now installed. In this case, the software will go to the 4th offset in the originating node and find a
continuation element with BA of 129. The software will then go to offset 6 in block 129 and find a default FLT
offset. The software then knows that this is a link node. From now on, it has to allocate more blocks in the parse
graph, such as block 2. At offset 7 of block 2, it will install a leaf element with FLT 3. Then it will install the link
node, which consists of writing a continuation element with BA of 2 at offset 6 of block 129.

When the hardware receives any packet with the header 4.6.7.x, it will look into the 4th offset of the originating
node and find a continuation element with BA of 129, then look at the 6th offset in block 129 and find a continuation
element with BA of 2, and then look at the leaf element at offset 7 of block 2 with FLT of 3. This FLT will be of value 3,
which is then forwarded to the Buffer Manager and eventually the Egress Bandwidth Manager.
Packet with Mask 255.255.0.0 

This subsection will build upon the parse graph in Fig. 20 and install a packet with an illustrative mask
255.255.0.0 and address of 4.8.x.y. In this case, the software will go to the 4th offset in the originating node and find
a continuation element with BA of 129. The software will then go to offset 8 in block 129 and find a default FLT
offset. At this time the software knows that it has to install a new FLT offset (say 4) in the 8th offset of block 129.

When the hardware receives any packet with the header 4.8.x.y, it will look into the 4th offset of the originating
node and find a continuation element with BA of 129, then look at the leaf element at offset 8 of block 129 with FLT
of 4, and terminate the search. In this case the hardware will do only 2 lookups.
Complex Filtered Packet 

Now assume that there was a requirement to filter a packet with header 4.5.6.8.9.x.y.z.11. There are no
restrictions to the above concept of parsing the packet, though the time it takes to parse the packet will increase since
the hardware will have to read and compare 9 bytes. The hardware will simply keep parsing, however, until it sees a
leaf element. The x.y.z bytes are blocks which contain continuation elements pointing to the next block, with all
continuation elements of x pointing to block y, all continuation elements of y pointing to block z, and all
continuation elements of z pointing to the block which has entry 11 as a leaf, and the rest being default. This is where
the fork element comes into play and may be called upon to look up the forwarding at the end of search 4.5.6.8.
Removing Simple IP Multicast Packets

The removal of a packet is similar to the reverse of adding an address to the parse graph, explained above. The
pseudocode for removal in this embodiment is as follows:

Walk down to the end leaf, remembering each block address and offset in block.

FOR (from leaf node to originating node)
    IF (only element in block)
        set default FLT offset at the previous NODE offset address
        free the last block
        go to previous block
    ELSE
        set default FLT offset at last leaf
        exit
    ENDIF
ENDFOR

Egress Bandwidth Manager 

Every I/O Module connects a NeoN Port to one or multiple physical ports. Each I/O Module supports
multiple traffic priorities injected via a single physical NeoN Port. Each traffic priority is assigned some bandwidth
by a network manager, as illustrated in Fig. M, being labeled as the "QoS (Packet & Cell)". The purpose of this
section is to define how bandwidth is managed on multiple traffic profiles.
NeoN Queuing Concepts

The goal of NeoN Queuing, of the invention, thus, is to be able to associate a fixed configurable bandwidth
with every priority queue and also to ensure maximum line utilization. Traditionally, bandwidth enforcement is done
in systems by allocating a fixed number of buffers per priority queue. This means that the enqueuing of data on the
priority queues enforces bandwidth allocation. When buffers of a certain queue are filled, then data for that queue is
dropped (by not enqueuing data on that queue), this being a rough approximation of the ideal requirement.

There are many real life analogies to understanding the concept of QoS of the present invention, e.g. cars on
a highway with multiple entry ramps or moving objects on a multi-channeled conveyor in a manufacturing operation.
For our purposes, let us examine the simple case of "cars on a highway". Assume that 8 ramps were to merge into
one lane at some point on the highway. In real life experiences, everyone knows that this could create traffic jams.
But if managed correctly (i.e. with the right QoS), then the single highway lane can be utilized for maximum
efficiency. One way to manage this flow is to have no control, and have it be serviced on a first come, first served
basis. This means that there is no distinction between an ambulance on one ramp and someone headed to the beach
on another ramp. But in the methodology of the invention, we define certain preferential characteristics for certain
entry ramps. There are different mechanisms that we can create. One is to send one car from each entry ramp in a
round robin fashion, i.e. each ramp is equal. This means counting cars. But if one of these "cars" turns out to be a
tractor trailer with 3 trailers, then in fact equal service is not being given to all entry ramps as measured by the
amount of highway occupied. In fact, if one entry ramp is all tractor trailers, then the backup on the other ramps could
be very significant. So it is important to measure the size of the vehicle and its importance. The purpose of the
"traffic cop" (aka QoS manager) is to manage which vehicle has the right of way, based on size, importance and
perhaps lane number. The "traffic cop" can, in fact, have different instructions every other day on the lane entry
characteristics based on what the "town hall manager", aka network manager, has decided. To conclude the
understanding of the QoS concept: QoS is a mechanism which allows certain datagrams to pass through queues in a
controlled manner, so as to achieve a deterministic and desired goal, which may vary from application to application,
e.g. bandwidth utilization, precision bandwidth allocation, low latency, low delay, priority etc.

The NeoN Queuing of the invention handles the problem directly. NeoN Queuing views the buffer
allocation as an orthogonal parameter to the queuing and bandwidth issue. NeoN Queuing will literally segment the
physical wire into small time units called "Time Slices" (as an example, approximately 200 nanoseconds on OC48 -
the time of a 64 byte packet on an OC48). Packets from the back-plane are put into the Priority Queues. Each time a
packet is extracted from a queue, a timestamp is also tacked along with that queue. The time stamp indicates the
distance in time from a "Current Time Counter", in Time Slice units, and when the next packet should be dequeued.
The "distance in time" is a function of a) the packet size information coming in from the back plane, b) the size of the
Time Slice itself and c) the bandwidth allocated for the priority queue. Once a packet is dequeued, another counter is
updated which represents the Next Time To Dequeue (NTTD) - such being purely a function of the size of the packet
just de-queued. NTTD is one for cell-based cards, because all packets are the same size and fit in one buffer. This
really proves that the NeoN Egress Bandwidth Manager is monitoring the line to determine exactly what next to send.
This mechanism, therefore, is a bandwidth manager rather than just a dequeuing engine.

The NeoN Queuing of the present invention, moreover, may be thought of as a TDM scheme for allocating
bandwidth for different priorities, using priority queuing for ABR (Available Bit Rate) bandwidth. An added
advantage of the NeoN Queuing is that, within the TDM mechanism, bandwidth is calculated not on "packet
count" but on "packet byte size". This granularity is a much better replica of [...] and
allows bandwidth calculations rather than simulated approximations. The second "NeoN Advantage" is that the
Network Manager can dynamically change the bandwidth requirement, similarly to a sliding scale on a volume
control. This is feasible since the bandwidth calculations for priority queues are not at all based on buffer allocation.
In NeoN Queuing, rather, the bandwidth allocation is based on time slicing the bandwidth on the physical wire.
This type of bandwidth management is absolutely necessary when running at very high line speeds, to keep line
utilization high.

Mathematics Used during Queuing 

First we will develop the variables and constants being used in the ultimate mathematics.

Symbol     Description

TS         Time slice of bandwidth on the wire used for calculations (approximately 200 nSec for OC48).
NTTS       Next Time To Send. This number is in units of TS, representing an address to de-queue from current time.
BitTime    Time period of a single bit on the wire of the current I/O module.
Δn         Delay factor in number of TS, representing bandwidth calculations set by the Network Manager, for
           priority Queue n.
BWn        Bandwidth of Queue n in percentage, as entered or calculated by the CPU software.
Pn         Number of Priority Queues.
TBW        Total bandwidth of the wire.
NTTD       Next Time To Dequeue.
CT         Current Time in TS units.

Consider first the user interface level to see how bandwidth is allocated amongst various priorities. The user
is normally given the job of dividing 100% bandwidth amongst various priorities. The user could also be presented
with breaking up the entire bandwidth in bits per second (as an example for OC48, it would be 2.4 Gbits). In either
case, some CPU software calculates a number pair, priority-Δn, from %/priority or Mbits/sec/priority. Since the CPU
is doing this calculation, it can be easily changed based on the I/O module. The Bandwidth Manager does not need to
know about the I/O module type, only caring about the priority-Δn pair. Thus if a user connected to the NeoN port
cannot handle data at full line rate, the CPU can change this value to adjust for the customer requirements.

Δn = 100 / BWn                                                                  (1)

Data (in the form of packet addresses) from the priority queues is dequeued onto the output fifo. The de-queue
engine's calculation of the Next Time To Send for a queue is governed by equation (2). There is one
such number for each queue, which gets updated every time a packet is dequeued from that queue; the NTTSn on the
right-hand side of equation (2) is the value held before the update.

NTTSn = (((Packet Byte Count × BitTime) / TS) × Δn) + NTTSn                     (2)

The Next Time To Dequeue is calculated as follows, [...] depending on the I/O Module:

NTTD = ((Packet Byte Count × BitTime) mod (TS)) + CT                            (3)
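
Equations (1) to (3) can be sketched as three small functions. Two readings are assumed here and are not stated in
the text: a factor of eight converts the byte count to bits before multiplying by the bit time, and the "mod (TS)" of
equation (3) is taken to mean "expressed in whole Time Slices, rounded up", which is the reading that agrees with the
CBR example given later. The function names are hypothetical.

```python
import math

BITS_PER_BYTE = 8  # assumption: the printed equations omit the byte-to-bit conversion


def delay_factor(bw_percent):
    """Equation (1): delay factor of a queue from its bandwidth in percent."""
    if bw_percent <= 0:
        raise ValueError("a queue with 0% bandwidth has no finite delay factor")
    return 100.0 / bw_percent


def packet_time_slices(packet_bytes, bit_time, ts):
    """Wire time of a packet expressed in Time Slice units."""
    return packet_bytes * BITS_PER_BYTE * bit_time / ts


def next_time_to_send(prev_ntts, packet_bytes, bit_time, ts, delta_n):
    """Equation (2): advance the queue's NTTS by the packet time scaled by its delay factor."""
    return packet_time_slices(packet_bytes, bit_time, ts) * delta_n + prev_ntts


def next_time_to_dequeue(ct, packet_bytes, bit_time, ts):
    """Equation (3), read as: whole Time Slices occupied by the packet, added to CT."""
    slices = math.ceil(packet_time_slices(packet_bytes, bit_time, ts) - 1e-9)
    return ct + max(1, slices)


# Illustration in abstract units: one bit takes 1 unit, one Time Slice is 360 bits (45 bytes).
delta = delay_factor(10)                                  # a queue given 10% of the wire
print(next_time_to_send(0.0, 45, 1.0, 360.0, delta))      # NTTS after one 45 byte datagram
print(next_time_to_dequeue(0, 90, 1.0, 360.0))            # NTTD after one 90 byte datagram
```

A queue with a small bandwidth share gets a large delay factor, so each dequeue pushes its NTTS further into the
future, while NTTD depends only on how long the datagram occupies the wire.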
Queuing Processing 

It is now in order to describe the processing needed to queue addresses from the back-plane on to the Priority
Queues, Fig. 24, which depicts the overall queuing and scheduling process. Control Data, which includes datagram
addresses, from the "NeoN Control Data Switch", is sorted into priority queues based on the QoS information
embedded in the Control Data, by the Queue Engine. The Scheduling Engine, which schedules datagram addresses
through use of the novel algorithms of the invention listed further below, is rendered independent of
the Queue Engine.

The Queuing Engine has the following tasks:

Watermark Calculations    Calculate when to set back-pressure, based on [...].

Drop Packets              Start dropping packets when the Priority Queues are full.

For each Priority Queue Pn, there will be a 'head pointer - pHeadn' and a 'tail pointer - pTailn'. The Input Fifo
feeds the Priority Queues Pn with buffer addresses from the back-plane. Additionally, there is a forward. For OC48
rates, and assuming 64 byte packets as average size packets, the following processing will be done in about
200 nSecs. The preferred pseudocode of the invention for the Enqueue Processor is as follows:

Read Input Fifo.
Find priority of the packet.
IF (room on queue)
    move buffer from Input Fifo to *pTailn of the priority queue.
    Advance pTailn.
    update statistics.
    increment buffer count on queue.
    IF (packet count on queue >= watermark of that queue)
        set back-pressure for that priority.
        update statistics.
    ENDIF
ELSE
    move buffer from Input Fifo to drop queue.
    update statistics.
ENDIF

The verbal explanation of the pseudocode listed above is as follows. As each control packet is read from the 'NeoN
Control Data Switch', it is put onto one of N queues after it is verified that physical space is available on the queue. If
there is no room on the queue, the data is put on a drop queue, which allows the hardware to return addresses back
to the originating port via the 'NeoN Control Data Switch'. Also, a watermark is set, per queue, to indicate to the
ingress to filter out non-preferred traffic. This algorithm is simple but needs to be executed in one TS.
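
A software model of the Enqueue Processor, following the pseudocode and its verbal explanation, might look like the
sketch below. The class names, the statistics counters and the use of Python deques in place of head and tail
pointers are illustrative assumptions; the hardware must of course complete the same steps within one TS.

```python
from collections import deque


class PriorityQueueState:
    """One priority queue with a fixed capacity and a back-pressure watermark."""

    def __init__(self, capacity, watermark):
        self.buffers = deque()
        self.capacity = capacity
        self.watermark = watermark
        self.back_pressure = False


class EnqueueProcessor:
    def __init__(self, num_queues, capacity, watermark):
        self.queues = [PriorityQueueState(capacity, watermark) for _ in range(num_queues)]
        self.drop_queue = deque()   # addresses to be returned to the originating port
        self.stats = {"enqueued": 0, "dropped": 0, "back_pressure_set": 0}

    def enqueue(self, priority, buffer_address):
        """Place one buffer address on its priority queue, or on the drop queue if full."""
        q = self.queues[priority]
        if len(q.buffers) < q.capacity:            # room on queue
            q.buffers.append(buffer_address)       # write at tail, advance tail
            self.stats["enqueued"] += 1
            if len(q.buffers) >= q.watermark and not q.back_pressure:
                q.back_pressure = True             # tell ingress to filter non-preferred traffic
                self.stats["back_pressure_set"] += 1
            return True
        self.drop_queue.append(buffer_address)     # no room: hand the address back
        self.stats["dropped"] += 1
        return False


ep = EnqueueProcessor(num_queues=8, capacity=3, watermark=2)
for address in (0x10, 0x11, 0x12, 0x13):
    ep.enqueue(0, address)
print(ep.stats, list(ep.drop_queue))
```

Keeping the drop decision separate from the watermark check mirrors the text: back-pressure is an early warning to
the ingress, while the drop queue handles the case where the warning came too late.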
Scheduling Processing 

This section will list the algorithm used to de-queue addresses from Priority Queues Pn onto the output fifo.
This calculation also has to be done during one TS.

Wait here till CT == NTTD AND no back pressure from output fifo.    // sync up
X = FALSE                                                            // some variable
FOR (all Pn, High to Low)
    IF (pHeadn != pTailn)
        IF (CT >= NTTSn)
            De-Queue (pHeadn)
            Calculate new NTTSn                                      // see equation (2) above
            Calculate NTTD                                           // see equation (3) above
            update statistics
            X = TRUE
            EXIT FOR                                                 // only one dequeue per TS
        ENDIF
    ENDIF
ENDFOR
IF (X == FALSE)
    FOR (all Pn, High to Low)
        IF (pHeadn != pTailn)
            De-Queue (pHeadn)
            update statistics
            X = TRUE
            EXIT FOR
        ENDIF
    ENDFOR
ENDIF
IF (X == FALSE)
    update statistics
ENDIF
Update CT

The function De-Queue is conceptually a simple routine, listed below:

De-Queue(Qn)
    *pOutputQTail++ = *pHeadn++

The explanation of the pseudocode listed above is that there are two FOR loops in the algorithm - the first
FOR loop enforcing the committed bandwidth to the queue, and the second FOR loop serving for bandwidth
utilization, sometimes called the aggregate bandwidth FOR Loop.

Examining first the Committed FOR Loop, the queues are checked from the Highest Priority Queue to the
Lowest Priority Queue for an available datagram to schedule. If a queue has an available datagram, the algorithm will
check to see if the queue's time has come to dequeue, by comparing its NTTSn against CT. If the NTTSn has fallen behind
CT, then the queue is dequeued; otherwise, the search goes on to the next queue until all queues are checked. If
data from a queue is scheduled to go out, a new NTTSn is calculated for that queue, and a NTTD is always calculated
when any queue is dequeued. When a Network Manager assigns weights for the queues, the sum of all weights should
not exceed 100%. Since NTTSn is based on datagram size, the output data per queue is a very accurate implementation
of the bandwidth set by the manager.

Let us now examine the Aggregate FOR Loop. This loop is only executed when no queue is de-queued
during the Committed FOR Loop. In other words, only one dequeue operation is performed in one TS. In this FOR
Loop, the queues are checked from Highest Priority to Lowest Priority for available data to dequeue. The algorithm
got into this FOR Loop for one of two reasons: either there was no data in all the queues, or the NTTSn of all queues
were still ahead of CT (it was not time to send). If the algorithm entered the Aggregate FOR Loop because of empty
queues, then the second time around the fate will be the same. However, if the Aggregate FOR Loop was entered
because the NTTSn was not reached for all queues, then the Aggregate Loop will find the highest priority such queue
and de-queue it; also in that case it would update NTTSn and calculate NTTD.
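
The two-loop behavior can be modeled in software as below. This is a sketch under stated assumptions rather than the
hardware algorithm itself: queue index 0 is taken as the highest priority, the output fifo never exerts back pressure,
byte counts are converted to bits with a factor of eight, equation (3) is read as a round-up to whole Time Slices, and
the Aggregate Loop updates NTTSn and NTTD as the explanation above states. All names are hypothetical.

```python
import math
from collections import deque


class Scheduler:
    def __init__(self, deltas, bit_time, ts):
        self.queues = [deque() for _ in deltas]   # entries are (address, size_in_bytes)
        self.deltas = list(deltas)                # delay factor per queue, equation (1)
        self.ntts = [0.0] * len(deltas)           # Next Time To Send per queue
        self.nttd = 0                             # Next Time To Dequeue (shared)
        self.ct = 0                               # Current Time in TS units
        self.bit_time, self.ts = bit_time, ts
        self.output = deque()

    def _dequeue(self, n):
        address, size = self.queues[n].popleft()
        self.output.append(address)
        slices = size * 8 * self.bit_time / self.ts
        self.ntts[n] += slices * self.deltas[n]                    # equation (2)
        self.nttd = self.ct + max(1, math.ceil(slices - 1e-9))     # equation (3), assumed reading

    def tick(self):
        """Run one Time Slice and return the index of the queue served, or None."""
        served = None
        if self.ct >= self.nttd:                  # the wire is free again
            # Committed loop: highest priority queue whose send time has arrived.
            for n, q in enumerate(self.queues):
                if q and self.ct >= self.ntts[n]:
                    served = n
                    break
            # Aggregate loop: nobody was due, so give the slot to the highest
            # priority queue that has data (it accumulates a debit).
            if served is None:
                for n, q in enumerate(self.queues):
                    if q:
                        served = n
                        break
            if served is not None:
                self._dequeue(served)
        self.ct += 1
        return served


# Two queues with 50% each; a Time Slice is the wire time of 45 bytes (abstract units).
s = Scheduler(deltas=[2.0, 2.0], bit_time=1.0, ts=360.0)
for i in range(3):
    s.queues[0].append((f"a{i}", 45))
    s.queues[1].append((f"b{i}", 45))
print([s.tick() for _ in range(6)])
```

With both queues permanently backlogged and equal shares, the committed loop alone decides every slot and the two
queues alternate; the aggregate loop only comes into play when no backlogged queue is due.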

The algorithm has built-in credits for queues that do not have data to de-queue in their time slot, and debits
for data that is de-queued in the Aggregate Loop. These credits and debits can accumulate over large periods of time.
The debit and credit accumulation time is a direct function of the size of the NTTSn field in bits; for example, a 32 bit
number would yield about 6 minutes in each direction using 160 nSec as TS (2^31 × 160 nSec). Each individual queue
could be configured to lose credits and/or debits, depending on the application for which this algorithm is used. For
example, if the algorithm was to be used mainly for CBR type circuits one would want to clear the debits fairly quickly,
whereas for bursty traffic they could be cleared rather slowly. The mechanism for clearing debits/credits is very simple:
asynchronously setting NTTSn to CT. If NTTSn is way ahead of CT (the queue has built a lot of debit), then setting
NTTSn to CT would mean losing all the debit. Similarly, if NTTSn had fallen behind CT (the queue has built a lot of
credit), then setting NTTSn to CT would mean losing all the credit.
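
The accumulation figure can be checked with simple arithmetic. The sketch below assumes the NTTSn field is compared
against CT as a signed distance, so that half of the 32-bit range is available as credit and half as debit; the helper
names are hypothetical.

```python
TS_SECONDS = 160e-9  # Time Slice used in the example above


def accumulation_minutes(field_bits, ts=TS_SECONDS):
    """Return (total span, span in each direction) in minutes for an NTTS field width."""
    total = (2 ** field_bits) * ts / 60.0
    return total, total / 2.0


def balance(ntts, ct):
    """Positive: credit (NTTS has fallen behind CT). Negative: debit (NTTS is ahead of CT)."""
    return ct - ntts


def clear_balance(ntts, ct):
    """Clearing credits or debits amounts to setting NTTS to CT."""
    return ct


total, each_way = accumulation_minutes(32)
print(f"total {total:.2f} min, each direction {each_way:.2f} min")
```

For a 32-bit field and a 160 nSec Time Slice this gives a total span of roughly eleven and a half minutes, that is a
little under six minutes of credit and the same of debit.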

Example of Implementing CBR Queue Using the Algorithm 

It is now appropriate to examine how to build a CBR queue out of the algorithm listed above, again
referencing Fig. 24. Let it be assumed that the output wire is running at OC48 speeds (2.4 Gbits per second) and that
Queue 1 (the highest Priority Queue) has been assigned to be the CBR Queue. The weight on the
CBR queue is configured by summing all the input CBR flow bandwidth requirements. For the sake of simplicity, assume
there are 100 flows going through the CBR Queue, each with a bandwidth requirement of 2.4 Mbits per second. The CBR
Queue bandwidth will then be 2.4 Mbits/sec times 100, i.e. 240 Mbits per second (i.e. 10%). In other words,

QRATEn = Σ Ingress Flow Bandwidth.

Δn = 100/10 = 10, based on Equation 1.

NTTSn would result in 10 every time a 45 byte datagram is dequeued, based on Equation 2.
NTTSn would result in 20 every time a 90 byte datagram is dequeued, based on Equation 2.

NTTD would result in 1 every time a 45 byte datagram is dequeued, based on Equation 3.

NTTD would result in 2 every time a 90 byte datagram is dequeued, based on Equation 3.
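
The figures of this example can be reproduced with a few lines of arithmetic. The sketch assumes that one Time Slice
equals the wire time of a 45 byte datagram, which is what the quoted results imply but is not stated explicitly, and
it works in bits so that the bit time cancels out. Variable and function names are hypothetical.

```python
import math

LINE_RATE_BPS = 2.4e9              # OC48 rate used in the example
flows_bps = [2.4e6] * 100          # one hundred CBR flows of 2.4 Mbits per second

cbr_rate_bps = sum(flows_bps)                        # weight of the CBR queue
bw_percent = 100 * cbr_rate_bps / LINE_RATE_BPS      # share of the wire
delta = 100 / bw_percent                             # Equation 1

TS_BITS = 45 * 8                   # assumed: one Time Slice carries 45 bytes


def ntts_increment(size_bytes):
    """Equation 2 without the previous NTTS term: how far NTTS advances per dequeue."""
    return (size_bytes * 8 / TS_BITS) * delta


def nttd_slices(size_bytes):
    """Equation 3 without the CT term: Time Slices the datagram occupies on the wire."""
    return math.ceil(size_bytes * 8 / TS_BITS)


for size in (45, 90):
    print(size, ntts_increment(size), nttd_slices(size))
```

Under these assumptions a 45 byte datagram occupies the wire for one Time Slice and then keeps the CBR queue off the
wire for ten, which is exactly a 10% share.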

This shows that the queue will be dequeued in a very timely manner, based on datagram size and the % of bandwidth
allocated to the queue. This algorithm is independent of wire speed, making it very scalable, and can achieve very
high data speeds. This algorithm also takes datagram size into account during scheduling, regardless of the
datagram being a cell or a packet. So long as the Network Manager sets the weight of the queue as the sum of all
ingress CBR flow bandwidth, the algorithm provides the scheduling very accurately.
Example of Implementing UBR Queue Using the Algorithm. 

It is very simple to implement a UBR queue using this algorithm, UBR standing for the queue
which uses the left over bandwidth on the wire. To implement this type of queue, one of the N queues is given 0%
bandwidth, and then this queue is dequeued when there is literally no other queue to dequeue. The NTTS will be
set so far in the future that after the algorithm dequeues one datagram the next one is never scheduled.

QoS Conclusion 

As has been demonstrated, the algorithm of the invention is very precise in delivering bandwidth, and its
granularity is based on the size of TS, being independent of cell/packet information. It also provides all of the
ATM services required, implying not only that packets also enjoy the ATM services but that cells and packets coexist
on the same interface.

Real Life Network Manager Examples 

This section will now consider different Network Management bandwidth management scenarios, all well
handled by the invention. In so far as the NeoN Network controller is concerned, there are n egress queues (as an
example it could be 8), each queue being assigned a bandwidth. The Egress Bandwidth Manager will deliver that
percentage very precisely. The Network Manager can also decide not to assign 100% of the bandwidth to all queues,
in which case the left over bandwidth will simply be distributed on a high to low priority basis. Besides these two
levels of control, the Network Manager can also examine statistics per priority and make strategic statistical
decisions on its own and change percentage allocations.
Exemplary Case 1: Fixed Bandwidth

In this scenario, 100% of the bandwidth is divided among all the queues. If all the queues are full all the time, the
queues will behave exactly like Fair Weighted Queuing. The reason for this is that the Egress Bandwidth Manager
will deliver the percentage of the line bandwidth as requested by the Network Manager, and since the queues are
never empty, the Egress Bandwidth Manager does not have time to execute the second FOR loop (Aggregate Loop), above
discussed.

If the queues are not full all the time, however, then during the time a queue is empty some other queue
may be serviced ahead of its time without a charge against its bandwidth.

As an example, if the Network Manager decided to allocate 12.5% bandwidth to every one of the eight
queues, then the Network Manager has to provide to the Egress Bandwidth Manager:

Δn-Priority List    List of all Δn, one for each priority.

Bit Time            Based on the I/O Module the Egress Bandwidth Manager is running on.

For a bandwidth of 12.5%, Δn would calculate to be 8.00 (100/12.5). For an OC48, Bit Time would calculate
to be 402 psec.
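
Both numbers follow from short calculations, sketched here. The OC-48 line rate of 48 × 51.84 Mbits per second is the
standard SONET figure; the variable names are illustrative only.

```python
NUM_QUEUES = 8
BW_PERCENT = 12.5                       # share given to each of the eight queues

# Delta-priority list handed to the Egress Bandwidth Manager, per Equation (1).
delta_list = [100 / BW_PERCENT for _ in range(NUM_QUEUES)]

# Bit time of an OC-48 line: 48 times the 51.84 Mbit/s OC-1 rate.
OC48_BPS = 48 * 51.84e6
bit_time_ps = 1e12 / OC48_BPS

print(delta_list)
print(f"{bit_time_ps:.1f} ps")
```

The shares sum to 100%, so with all queues backlogged the second (aggregate) loop never runs and each queue is served
once in every eight packet times.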

Exemplary Case 2: Mixed Bandwidth 

In this example, not all of the bandwidth is divided among all of the queues. In fact, the sum of all fixed
bandwidth on the queues is not 100% of the bandwidth available. The Egress Bandwidth Manager will deliver the
constant bandwidth on the queues up to the allotted amount, and then aggregate traffic amongst the priorities on the
remaining bandwidth. This guarantees some percent of a class of traffic to make it through the port and also provides
prioritized traffic. For queues that are not full during the allocated time, that bandwidth will be lost to the aggregate
bandwidth.

Exemplary Case 3: No Mixed Bandwidth For All Queues

In this scenario, no bandwidth is allocated as fixed bandwidth for any of the queues. The queues will then behave
purely like prioritized queuing. The first FOR Loop listed in the Scheduling Processing section will be considered a NOP.
Exemplary Case 4: Dynamic Bandwidth 

In this illustration, the Network Manager may initially come up with No Mixed Bandwidth for all Queues
and then, as it starts to build committed bandwidth circuits, may create fixed bandwidth queues. The sum of the
requirements of bandwidth of the flows at an ingress port would dictate the size of the constant bandwidth on the
egress port. The granularity of the allocatable egress bandwidth is largely dependent on the depth of the floating
point depth. As an example, it may be assumed that two decimal places may suffice. This then implies 1/100th of one
percent, and would calculate to be 240 kHz for an OC48 line and 62 kHz for an OC12 line.

It should be observed that the above cases are examples only, and the application of the [...]
invention is not limited to these cases.

Further modifications will occur to those skilled in this art, and such are considered to fall within
the spirit and scope of the invention as defined in the appended claims.
Claims



1. A method of simultaneously processing information contained in data cells and data packets or frames received
at the ingress of a data networking system, that comprises, applying both the received data cells and data packets to a
common data switch; controlling the switch for cell and packet data-forwarding indiscriminately using common network
hardware and algorithms for forwarding, based on control information contained in the cell or packet and without
transforming packets into cells; and controlling with a common bandwidth management algorithm both cell and packet
data forwarding without impacting the correct forwarding characteristics of either.

2. A method as claimed in claim 1 wherein the cell and packet control information is processed in a common
forwarding engine with common algorithms independent of context-sensitive information contained in the cell or packet.

3. A method as claimed in claim 2 wherein the information from the forwarding engine is passed to a network
egress queue manager and thence to a network egress transmit facility, and in a manner such as to provide minimum cell
delay variation.

4. A method as claimed in claim 3 wherein quality of service information is included in the information passed
from the forwarding engine and managed by the queue manager for both cells and packets simultaneously and based upon
the common algorithm.

5. A method as claimed in claim 4 wherein a common parsing algorithm is also used for similarly forwarding both
cell data and data packets.

6. A method as claimed in claim 4 wherein the queuing managing employs processing that operates as each control
packet is read from the switch, to put the same into one of a plurality of queues after it is verified that available
physical space exists on the queue.

7. A method as claimed in claim 6 wherein, should there be no such space, the data is put in a drop queue and
returned by the switch to the ingress of the network.

8. A method as claimed in claim 7 wherein a watermark is set for each queue to instruct each ingress to filter out
non-preferred data traffic.

9. A method as claimed in claim 8 wherein bandwidth is allocated for different priorities by packet byte size and
based upon time slicing the bandwidth.

10. A method as claimed in claim 9 wherein the network manager dynamically varies the bandwidth
requirement.

11. A method of processing information contained in data cells and data packets received at the ingress of a
data networking system, that comprises, applying both the received data cells and data packets to a common
data forwarding and routing switch; managing both cell and packet data switching in the common switch using
common hardware, common quality of service algorithms, and common forwarding algorithms; and
controlling the packet switching independently of and without interfering with the cell data switching.

12. A method of processing packets of information from a forwarding switch and queue managing the
forwarding of the same, that comprises, as each packet is read from the switch, putting the same into one of a
plurality of queues after it is verified that available physical space exists in the queue; placing the packet
information in a drop queue should there be no such space and returning the packet information through the
switch; setting a watermark for each queue to enable the filtering of non-preferred information traffic; and
allocating bandwidth for different priorities by packet byte size and based upon time slicing the bandwidth.

13. A system architecture apparatus for simultaneously processing information contained in data cells and
data packets received at the ingress of a data networking system, said apparatus having, in combination, means
for applying both the received data cells and data packets from the ingress to a common data switch within the
system; means for controlling the switch for cell and packet indiscriminately, for forwarding, by a common
algorithm based on control information contained in the cell or packet and without transforming packets into
cells; and means for controlling with a common bandwidth management algorithm both cell and packet data
forwarding without impacting the correct forwarding characteristics of either.

14. Apparatus as claimed in claim 13 wherein the cell and packet control information is processed in a
common forwarding engine with common algorithms, independent of context-sensitive information contained
in the cell or packet.

15. Apparatus as claimed in claim 14 wherein means is provided for passing the information from the
forwarding engine to a network egress queue manager and thence to a network egress transmit facility, and in a
manner such as to provide minimal cell/packet delay variation.

16. Apparatus as claimed in claim 15 wherein quality of service information is included in the information
passed from the forwarding engine and managed by the queue manager for both cells and packets
simultaneously based upon the common algorithm.

17. Apparatus as claimed in claim 16 wherein a common parsing algorithm is also provided for
forwarding both cells and data packets.

18. Apparatus as claimed in claim 16 wherein the queue managing employs processing that operates as each
control packet is read from the switch, to put the same into one of a plurality of queues after it is verified that
available physical space exists on the queue.

19. Apparatus as claimed in claim 18 wherein, should there be no such space, means is provided for the data to be
put in a drop queue and returned by the switch to the ingress of the network.

20. Apparatus as claimed in claim 19 wherein a watermark is set for each queue to instruct such ingress to filter
out non-preferred data traffic.

21. Apparatus as claimed in claim 20 wherein means is provided for allocating bandwidth for different priorities by
packet byte size and based upon time slicing the bandwidth.

22. Apparatus as claimed in claim 21 wherein the network manager dynamically varies the bandwidth requirement.

23. Apparatus as claimed in claim 14 wherein the cell data is of ATM fixed size units and the packet data is of
arbitrary size.

24. Apparatus as claimed in claim 14 wherein, between the ingress and the switch, a VCI function assembly is
interfaced.

25. Apparatus as claimed in claim 24 wherein said assembly connects not only to the switch but also to a header
lookup and forwarding engine for both the cell and packet data, with the engine connecting through a control data switch
and a quality of service managing module to a buffer, also inputting from the output of the switch.

26. Apparatus as claimed in claim 25 wherein the buffer feeds a cell data VC shaping circuit that connects with the
system egress.
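
The queue-managing steps recited in claims 12 and 18 to 22 (a space check before enqueueing, a drop queue for
data that does not fit, a per-queue watermark, and bandwidth allocated by byte size per time slice) can be
illustrated with a minimal sketch. It is not the claimed implementation: every class, field and method name
below is hypothetical, and the per-queue byte budget per time slice is only one possible reading of "time
slicing the bandwidth", since the claims do not fix a scheduling discipline.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    priority: int    # index of the target queue (hypothetical field)
    size_bytes: int  # packet length in bytes


@dataclass
class EgressQueue:
    capacity_bytes: int   # physical space available to this queue
    watermark_bytes: int  # fill level at which ingress filtering is requested
    bytes_per_slice: int  # byte budget granted to this queue per time slice
    packets: deque = field(default_factory=deque)
    used_bytes: int = 0


class QueueManager:
    """Illustrative queue manager; an assumption, not the patented design."""

    def __init__(self, queues):
        self.queues = queues       # one EgressQueue per priority
        self.drop_queue = deque()  # data to be returned through the switch

    def enqueue(self, pkt):
        """Queue the packet only after verifying that space exists."""
        q = self.queues[pkt.priority]
        if q.used_bytes + pkt.size_bytes > q.capacity_bytes:
            self.drop_queue.append(pkt)  # no space: place in the drop queue
            return False
        q.packets.append(pkt)
        q.used_bytes += pkt.size_bytes
        return True

    def filter_requests(self):
        """Indices of queues at or above their watermark."""
        return [i for i, q in enumerate(self.queues)
                if q.used_bytes >= q.watermark_bytes]

    def run_time_slice(self):
        """Send from each queue until its byte budget for the slice is spent."""
        sent = []
        for q in self.queues:
            budget = q.bytes_per_slice
            while q.packets and q.packets[0].size_bytes <= budget:
                pkt = q.packets.popleft()
                budget -= pkt.size_bytes
                q.used_bytes -= pkt.size_bytes
                sent.append(pkt)
        return sent
```

In this sketch a network manager could vary the bandwidth requirement at run time, as in claims 10 and 22,
simply by changing a queue's `bytes_per_slice` between slices.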